In this Issue




Light-emitting diodes bright enough for outdoor applications in bright sunlight
(automobile tail lights, for example) have been a long-sought goal of LED
research. HP's latest LEDs, described in the article on page 6, should meet the
needs of many outdoor applications. Made from aluminum indium gallium
phosphide (AlInGaP), they surpass the brightness of any previously available
visible LEDs and come in a range of colors from red-orange to green.
Technically, they are double-heterostructure LEDs on an absorbing substrate and
are grown by means of a technique called organometallic vapor phase epitaxy,
which has been used for producing semiconductor laser diodes but not for the
mass production of LEDs. In addition to the technical details of the new LEDs,
the article provides a history of LED material and structure development.

Let's say you have a computing network in which users need to share resources. A user needs to move a 
compute job to a remote machine to free local compute cycles or access remote applications. You would 
like your computers to be equally loaded, and you would like to make remote access as automated as 
possible. Also, you want disabled machines to be automatically avoided. HP Task Broker (see page 15) is 
a software tool that distributes applications among servers efficiently and transparently. When a user 
requests an application or service, HP Task Broker sends a message to all servers, requesting bids for 
providing the service requested. Each server returns its "affinity value," or bid, for the service, and the 
server with the highest value is selected. Tasks are distributed at the application level rather than the 
procedure level, so no modifications are required to any application. Besides load balancing and increased 
availability, the benefits of HP Task Broker include multiple-vendor interoperability, easier network 
upgradability, and reduced costs. 
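
The bidding step described above can be sketched in a few lines. This is only an illustration of "highest affinity value wins, disabled machines are skipped"; the `Server` class, its fields, and `select_server` are hypothetical names, not part of HP Task Broker.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    """Hypothetical record of one server's reply to a request for bids."""
    name: str
    affinity: float          # the bid ("affinity value") for the requested service
    available: bool = True   # False models a disabled or unreachable machine


def select_server(servers: List[Server]) -> Optional[Server]:
    """Return the available server with the highest affinity value, or None."""
    bids = [s for s in servers if s.available]
    return max(bids, key=lambda s: s.affinity, default=None)


if __name__ == "__main__":
    replies = [Server("alpha", 10), Server("beta", 40), Server("gamma", 90, available=False)]
    winner = select_server(replies)
    print(winner.name if winner else "no server available")
```

Because selection happens per application request, nothing in the application itself has to change, which is the point made in the summary above.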

Real-time systems, unlike timesharing and batch systems, must respond rapidly to real-world events and 
therefore require special algorithms to manage system resources. The HP-RT operating system is the 
result of porting an existing operating system to the HP 9000 Model 742rt board-level real-time computer. 
The HP-RT kernel implementation, including the concepts of threads, counting semaphores, and priority- 
inheritance semaphores, is described in the article on page 23. The article on page 31 discusses the 
handling of interrupts in HP-RT and tells how the HP PA-RISC architecture of the Model 742rt affected 
the operating system design. 

The HP Tsutsuji logic synthesis system (page 38) takes logic designs expressed as block diagrams and 
transforms them into netlist files that gate-array manufacturers can use to produce application-specific 
integrated circuits (ASICs). In many applications, the system reduces the time required to design an ASIC 
by a factor of ten or more. Tsutsuji was developed jointly by HP Laboratories and the Yokogawa-Hewlett- 
Packard Design Systems Laboratory in Kurume, Japan. Because the World Azalea Congress was being 
held in Kurume when the project began, Tsutsuji — the Japanese word for azalea — was chosen as the 
name of the system. Currently, Tsutsuji is only being marketed in Japan. The article covers its architecture, 
its operation, and several applications. 
A desktop scanner digitizes photographs, documents, drawings, and three-dimensional objects and 
sends the information to a computer, usually for electronic publishing applications. The HP ScanJet IIc 
scanner (page 52) is a 400-dot-per-inch flatbed scanner that has black and white, color, and optical 
character recognition capabilities. Using an HP-developed color separator design, it provides fast, 
single-scan, 24-bit color image scanning. The article describes the color separator design and discusses 
the challenge of trying to duplicate human vision so that colors look the same in all media. 

Issues in the design of a workstation computer for industrial automation applications include serviceability, 
input/output capabilities, support, reliability, graphics, front-to-back reversibility, mounting options, form 
factor, airflow management, acoustics, and modularity. How these issues are addressed by the mechanical 
design of the HP 9000 Models 745i and 747i entry-level industrial workstations is the subject of the article 
on page 62. 

Franco Canestri is an application and technical support specialist for HP cardiology products in Europe. 
He also continues the medical laser research he began as an assistant fellow at the National Cancer 
Institute of Milan, focusing on orthopedic surgery applications. In the paper on page 68, he describes 
recent work on an algorithm for real-time surgical laser beam control using HP 9000 computers. 

The final three papers in this issue are from the 1992 HP Software Engineering Productivity Conference. 

► On page 73 is a description of a defect management system created for software and firmware devel- 
opment at two HP divisions. The system uses a commercial relational database management system. 

► The C++ language and object-oriented programming offer potential productivity gains, including code 
reuse, but there can be pitfalls. The article on page 85 discusses these as well as some new features of 
the language. 

► In developing real-time software, it may be difficult to go from a structured analysis 
model to a structured design. To help make this transition for HP medical ultrasound software, one HP 
division used a high-level design methodology called ADARTS. It's discussed on page 90. 

R.P. Dolan 
Editor 



Cover 

This photograph illustrates many of the features of the new HP AlInGaP light-emitting diodes, including 
their range of colors, their package types, their narrow-beam light output, and their brightness when 
viewed head-on. Although we took the picture in the dark, the main applications are daylight-viewable 
displays and automotive lighting. 



What's Ahead 

Featured in the October issue will be the design of the HP 54720 sampling digitizing oscilloscope family, 
which offers sample rates up to 8 gigasamples per second and bandwidths from 500 megahertz to 2 
gigahertz, the HP E1430A 10-megahertz analog-to-digital converter module, which has 110-dB linearity 
and built-in memory and filter systems, and the HP 4396A 1.8-gigahertz vector network and spectrum 
analyzer, a combination analyzer with laboratory-quality performance in all functions. 
High-Efficiency Aluminum Indium 
Gallium Phosphide Light-Emitting 
Diodes 



These devices span the color range from red-orange to green and have the 
highest luminous performance of any visible LED to date. They are 
produced by organometallic vapor phase epitaxy. 

by Robert M. Fletcher, Chihping Kuo, Timothy D. Osentowski, Jiann Gwo Yu, and Virginia M. Robbins 



Since light-emitting diodes (LEDs) were first introduced 
commercially in the late 1960s, they have become a common 
component in virtually every type of consumer and industrial 
electronic product. LEDs are used in digital and alphanumeric 
displays, bar-graph displays, and simple on/off status 
indicators. Because of their limited brightness, LEDs 
have tended to "wash out" under sunlight conditions and 
have not generally been used for outdoor applications. (Recall 
the quick demise of digital watches with LED displays in 
the early 1970s.) However, the introduction of bright 
red-light-emitting AlGaAs LEDs in the mid and late 1980s 
partially eliminated this drawback. Now, another family of 
LEDs, made from AlInGaP, has been introduced. These 
LEDs surpass the brightness of any previous visible LEDs 
and span the color range from red-orange to green. With this 
breakthrough in brightness in a broad range of colors, we 
should see a wide variety of new applications for LEDs 
within the next decade. 

History 

Although the various LED display and lamp packages are 
familiar to many (for example, the usual LED single-lamp 
package with its hemispherical plastic dome, or the 
seven-segment digital display package), the diversity of materials 
used in the chips that go into these packages is not as 
familiar. Fig. 1 summarizes the various semiconductor materials 
used in LEDs and charts the evolution of the technology 
over the past 25 years. In the figure, luminous performance, 
measured in lumens* of visible light output per watt of 
electrical power input, is plotted over time starting from 1968 
and projected into the mid-1990s. 

The first commercial LEDs produced in the late 1960s were 
simple p-n homojunction devices made by diffusing Zn into 
GaAsP epitaxial material grown by vapor phase epitaxy on a 
GaAs substrate.1 GaAsP is a direct-bandgap semiconductor 
for compositions where the phosphorus-to-arsenic ratio in 
the crystal lattice is 0.0 to 0.4. Above 0.4, the bandgap 
becomes indirect.** The composition of 60% As and 40% P 
produces red near-bandgap light at about 650 nm. Quantum 
efficiency in a simple homojunction device such as this is 
low, but these so-called "standard red" LEDs were and still 
are inexpensive and relatively easy to produce. The red 
numeric displays in the first pocket calculators were made 
of standard red LEDs. 

* A lumen is a measure of visible light flux that takes into account the wavelength sensitivity of 
the human eye. An LED's output in lumens is obtained by multiplying the radiant flux output of 
the LED in watts by the eye's sensitivity as defined by the Commission Internationale de 
l'Eclairage (CIE). 
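
The lumen definition in the footnote can be made concrete with a short calculation. The sketch below treats an LED as a monochromatic source (an assumption; real LEDs have a finite spectral width) and uses a few standard CIE photopic sensitivity values. The function names and the example numbers are illustrative only and are not taken from the article.

```python
# Photopic eye sensitivity V(lambda) from the CIE standard observer,
# tabulated here at a few wavelengths (nm) only.
V_LAMBDA = {555: 1.000, 590: 0.757, 610: 0.503, 630: 0.265, 650: 0.107}

# Peak luminous efficacy: 683 lm per radiant watt at 555 nm.
K_M_LM_PER_W = 683.0


def luminous_flux_lm(radiant_flux_w: float, wavelength_nm: int) -> float:
    """Lumens emitted by a (hypothetically monochromatic) source."""
    return K_M_LM_PER_W * V_LAMBDA[wavelength_nm] * radiant_flux_w


def luminous_performance_lm_per_w(radiant_flux_w: float,
                                  electrical_power_w: float,
                                  wavelength_nm: int) -> float:
    """Lumens of visible output per watt of electrical input."""
    return luminous_flux_lm(radiant_flux_w, wavelength_nm) / electrical_power_w


if __name__ == "__main__":
    # Assumed example: 1 mW of light from 40 mW of electrical input.
    for wl in (650, 590):
        print(wl, "nm:", round(luminous_performance_lm_per_w(0.001, 0.040, wl), 2), "lm/W")
```

The same radiant power is worth several times more lumens at 590 nm than at 650 nm, which is one reason a shorter emission wavelength can raise luminous performance even when the radiant efficiency is unchanged.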

At around the same time, GaP epitaxial layers doped with 
zinc and oxygen and grown on GaP substrates by liquid phase 
epitaxy were introduced. The GaP substrate, unlike GaAs, is 
transparent to the emitted light, allowing these devices to be 
more efficient than the GaAsP standard red diodes. However, 
the emission wavelength at 700 nm is near the edge of 
the visible spectrum, which limits their usefulness. 

A minor breakthrough in LED performance came in the 
early 1970s with the addition of nitrogen to GaAsP and GaP 
epitaxial materials.2,3,4 Nitrogen in these semiconductors is 
not a charge dopant; rather it forms an isoelectronic impurity 
level in the bandgap which behaves as an efficient radiative 
recombination center for electrons and holes. In this way, 
even indirect-bandgap GaP and indirect compositions of 
GaAsP can be made to emit sub-bandgap light efficiently. By 
the mid-1970s, orange and yellow LEDs made from various 
alloys of GaAsP and green LEDs made from GaP appeared 
on the market. 

** In a direct-bandgap semiconductor, the recombination of electrons and holes has a high 
probability of occurring through a band-to-band radiative process in which a photon is 
emitted. In an indirect-bandgap semiconductor, radiative band-to-band recombination 
requires the interaction of a lattice vibration (a phonon) with the electron and hole. For this 
interaction the probability is low, and consequently nonradiative recombination processes 
dominate. 

Fig. 1. Time evolution of light-emitting diode technology. 

6 August 1993 Hewlett-Packard Journal 

©Copr. 1949-1998 Hewlett-Packard Co. 

The next breakthrough occurred almost a decade later with 
the introduction of AlGaAs red-light-emitting LEDs, grown 
by liquid phase epitaxy. These provided two to ten times the 
light output performance of red GaAsP.5,6 The reason for the 
range of performance of AlGaAs is that it can be produced 
in various structural forms: a single heterostructure on an 
absorbing substrate (SH AS AlGaAs), a double heterostructure 
on an absorbing substrate (DH AS AlGaAs), and a 
double heterostructure on a transparent substrate (DH TS 
AlGaAs). (See page 8 for an explanation of heterostructures.) 
This was an important milestone in LED technology 
because for the first time LEDs could begin to compete with 
incandescent lamps in outdoor applications such as automobile 
tail lights, moving message panels, and other applications 
requiring high flux output. Included in Fig. 1 is the flux 
required for a red automobile tail light, which is well within 
the performance range of AlGaAs LEDs. Unfortunately, 
AlGaAs LEDs can efficiently emit only red (or infrared) 
light, which makes them unsuitable for many applications. 

The latest technology advance, and the subject of this paper, 
is the development of AlInGaP double-heterostructure 
LEDs. These devices span the color range from red-orange 
to green at light output performance levels comparable to or 
exceeding those of AS and TS AlGaAs.7,8 The AlInGaP materials 
are grown by a technique called organometallic vapor 
phase epitaxy. This growth technology has been used for the 
production of optoelectronic semiconductors, especially laser 
diodes, for a number of years, but it has not been previously 
used for the mass production of LEDs. 

Hewlett-Packard's AlInGaP devices currently being introduced 
to the market have the highest luminous performance 
of any visible LED to date. As the technology matures 
through the 1990s, performance levels are expected to increase 
further and reach into the tens-of-lumens-per-watt 
range. 

Properties of AlInGaP 

The bandgap properties of several compound semiconductors 
used in LED technology are shown in Fig. 2. Illustrated is 
the bandgap energy as a function of crystal lattice constant. 
In a diagram such as this, binary compound semiconductors, 
such as GaP and InP, are plotted as single points, each with 
a unique bandgap and lattice constant. Ternary compounds, 
such as AlGaAs, are represented by a line drawn between 
the two constituent binary compounds, in this case AlAs and 
GaAs. Finally, quaternary compounds, such as AlInGaP, are 
represented by an enclosed region with the constituent 
binary compounds at the vertices. The complex nature of 
the crystal band structure and the transition from a 
direct-bandgap semiconductor to an indirect-bandgap semiconductor 
are what give the enclosed region its characteristic 
shape. Properties such as this are usually obtained from 
both experiment and theory. 

This type of diagram is useful for designing LED materials 
for at least two reasons. First, it shows what compositions 
of AlInGaP are direct-bandgap and therefore readily useful 
for making efficient LEDs. Second, for high-quality epitaxial 
growth it is necessary for the epitaxial layers to have the 
same lattice constant as the substrate on which they are 
grown. This diagram shows what compositions of AlInGaP 
will provide this lattice matching condition for a given 
substrate. For visible LEDs, the two common substrates used 
are GaAs and GaP. Clearly GaP is not immediately useful 
here because it is at the indirect-bandgap end of the 
AlInGaP composition region. This leaves GaAs as the only 
suitable substrate. A vertical line drawn from the x axis 
through the GaAs point intersects the AlInGaP region and 
indicates the compositions that lattice match to a GaAs 
substrate. The composition that gives this lattice match 
condition is written as: 

$$(\mathrm{Al}_x\mathrm{Ga}_{1-x})_{0.5}\mathrm{In}_{0.5}\mathrm{P}.$$

This notation, which is typical for describing compound 
semiconductors, indicates the proportions of the constituent 
atoms within the crystal lattice. In this case, half the group 
III atoms are indium and the other half are some mixture of 
aluminum and gallium. By coincidence, aluminum and gallium 
have approximately the same atomic size within the 
lattice. As long as the amount of indium remains fixed at 0.5, 
the aluminum-gallium mix can vary continuously, from all 
aluminum to all gallium, and the lattice constant will not 
change appreciably. What will change is the bandgap of the 
material. If the aluminum is kept below x = 0.7, the bandgap 
is direct; above values of x = 0.7, the bandgap becomes 
indirect. This case is illustrated in Fig. 2 where the line of 
lattice match crosses from the direct region into the indirect 
region. 
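
As a small worked illustration of this notation, the sketch below converts the aluminum parameter x into the fraction of group III sites occupied by each element and applies the x = 0.7 direct/indirect rule quoted above. The function names are hypothetical, and the text does not say what happens exactly at x = 0.7, so the boundary is treated as indirect here by assumption.

```python
DIRECT_LIMIT_X = 0.7  # from the text: direct below x = 0.7, indirect above


def group_iii_fractions(x: float, indium: float = 0.5) -> dict:
    """Fractions of group III lattice sites for (Al_x Ga_(1-x))_(1-indium) In_indium P."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie between 0 (all gallium) and 1 (all aluminum)")
    return {
        "Al": (1.0 - indium) * x,
        "Ga": (1.0 - indium) * (1.0 - x),
        "In": indium,
    }


def is_direct_bandgap(x: float) -> bool:
    """True when the lattice-matched alloy is on the direct side of the crossover."""
    return x < DIRECT_LIMIT_X  # boundary case treated as indirect (assumption)


if __name__ == "__main__":
    for x in (0.0, 0.3, 0.8):
        print(x, group_iii_fractions(x), "direct" if is_direct_bandgap(x) else "indirect")
```

For x = 0.3, for example, 15% of the group III sites hold aluminum, 35% gallium, and 50% indium, and the material is still direct-bandgap.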

The bandgap diagram indicates the potential of a material 
for making LEDs, that is, whether a material has a direct 
bandgap and whether the bandgap energy is within the 
proper range for producing visible photons. The actual 
performance of a device depends on a number of additional 
factors. First, the growth of high-quality epitaxial material 
must be possible. Ideally, the growth should take place on a 
commonly available, inexpensive substrate and should be 
lattice matched to that substrate. Second, it must be possible 
to form a p-n junction in the material. Third, to obtain 
the highest quantum efficiency, it should be possible to grow 
a double heterostructure. In the case of AlInGaP, all three of 
these conditions are satisfied. 
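
Whether a bandgap energy is "within the proper range for producing visible photons" follows from the photon energy relation E = hc/λ. The sketch below uses hc ≈ 1239.84 eV·nm; the 400 to 700 nm limits for the visible range are an approximate, assumed convention, and the function names are illustrative.

```python
HC_EV_NM = 1239.84  # Planck constant times speed of light, in eV*nm


def emission_wavelength_nm(bandgap_ev: float) -> float:
    """Wavelength of near-bandgap light for a given bandgap energy."""
    return HC_EV_NM / bandgap_ev


def bandgap_ev(wavelength_nm: float) -> float:
    """Photon (bandgap) energy corresponding to a given wavelength."""
    return HC_EV_NM / wavelength_nm


def is_visible(wavelength_nm: float, lo_nm: float = 400.0, hi_nm: float = 700.0) -> bool:
    """Rough visibility check; the limits are an assumed convention."""
    return lo_nm <= wavelength_nm <= hi_nm


if __name__ == "__main__":
    # The 650-nm standard red GaAsP emission corresponds to about 1.91 eV.
    print(round(bandgap_ev(650.0), 3), "eV")
    print(round(emission_wavelength_nm(2.2), 1), "nm", is_visible(emission_wavelength_nm(2.2)))
```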
The Structure of LEDs: Homojunctions and Heterojunctions 

Light-emitting diodes come in a variety of types, differing in materials and in 
epitaxial structure. GaAsP and GaP are used for the majority of red, orange, yellow, 
and green LEDs currently in use. All these LEDs are homojunction p-n diodes with 
either diffused junctions or junctions grown in during the epitaxial process. Fig. 1 
shows a cross section of a typical GaAsP homojunction chip. In other material 
systems, such as AlGaAs and AlInGaP, it is possible to grow layers of different 
compositions (heterostructures) and therefore different bandgaps while keeping 
the lattice constant the same in all the layers. This capability means that more 
complex and efficient LED structures can be grown with these materials. 

Fig. 2 illustrates an AlGaAs single-heterostructure (SH) chip. The epitaxial part of 
the device consists of an n-type active layer where the light is generated, and a 
single p-type window layer on top. The composition of the window layer is chosen 
to have a significantly larger bandgap than the active layer, and as such it is 
transparent to the light generated in the active layer (hence the name window layer). 
The single heterojunction (excluding the one with the substrate), which in this 
case is also the p-n junction, is what defines this as a single-heterostructure 
device. The efficiency increase is a result of the transparency of the window layer 
and increased injection efficiency at the p-n heterojunction. 

A modification of the single heterostructure is the double heterostructure (DH) 
shown in Fig. 3, again using AlGaAs as an example. In this case an additional 
layer is grown between the active layer and the substrate. In a double 
heterostructure, the two high-bandgap layers surrounding the active layer are referred to 
as confining layers. Together they act to confine electrons and holes within the 
active layer where they recombine radiatively. The lower confining layer efficiently 
injects electrons into the active layer and helps channel some of the light out of 
the chip, while the upper confining layer acts as a window for the generated light. 

Fig. 2. AlGaAs single-heterostructure LED on an absorbing GaAs substrate. 

Fig. 3. AlGaAs double-heterostructure LED on an absorbing substrate. 



OMVPE Growth of AlInGaP 

AlInGaP and its related compounds GaInP and AlInP have 
been the subject of study since the 1960s. Only within the 
last eight years, however, have researchers been able to 
grow AlInGaP controllably and with high quality. 
Double-heterostructure AlInGaP semiconductor lasers that have a 
GaInP active layer have been commercially available for at 
least five years. The development of techniques for producing 
AlInGaP LEDs has been slower because of the greater 
epitaxial layer thicknesses required and because of the 
larger quantities needed to supply market demand. Also, 
high-performance LEDs require higher-quality epitaxial 
growth than semiconductor lasers. This is because LEDs 
generally operate at much lower current densities than 
semiconductor lasers (tens of amperes per square centimeter 
versus hundreds or thousands of amperes per square 
centimeter), and nonradiative defects can dominate the 
recombination process. 
Fig. 4. AlGaAs double-heterostructure LED on a transparent substrate. 

If the upper confining layer is grown especially thick, it can act as a mechanical 
"substrate," and the original absorbing GaAs substrate can be removed by chemical 
etching. This is a transparent-substrate double-heterostructure (TS DH) device 
and is shown in Fig. 4. In Fig. 4 the chip is turned upside down so that the thick 
AlGaAs confining layer is on the bottom. This is the most efficient type of LED 
chip, with external efficiencies approaching 15% for red AlGaAs lamps. 

Finally, there is the AlInGaP LED structure. This device is shown in Fig. 5. It 
resembles the AlGaAs double heterostructure except for the presence of the GaP 
window layer. In the case of AlGaAs, the upper confining layer can be grown many 
micrometers thick, enough to couple light out of the chip efficiently. With AlInP, 
however, for epitaxial growth reasons it is not possible to produce a thick enough 
layer of high-quality AlInGaP to act as an efficient window, or even to spread the 
current effectively to the edges of the chip. By growing a thick GaP layer on top of 
the active device structure, an efficient window is produced and the sheet resistance 
of the p layers is reduced enough to promote adequate current spreading. 

Fig. 5. AlInGaP double-heterostructure LED on an absorbing substrate. 



Vapor phase epitaxy (VPE) and liquid phase epitaxy (LPE) 
are the commonly used techniques for the mass production 
of LED materials. GaAsP is best grown using the VPE 
method, and AlGaAs and GaP are grown using the LPE 
method. Neither of these techniques works well for the 
growth of AlInGaP. A third technique called organometallic 
vapor phase epitaxy (OMVPE) does work well. OMVPE is 
similar to conventional VPE in which the reactant materials 
are transported in vapor form to the heated substrate where 
the epitaxial growth takes place. The main difference is that 
instead of using metallic chlorides as the source materials 
(GaCl3 or InCl3, for example), OMVPE uses organometallic 
molecules. The materials used in the case of AlInGaP are 
trimethylaluminum, trimethylgallium, and trimethylindium. 
Other similar organometallic compounds are sometimes 
used as well. As in VPE, phosphine gas is used as the source 
of phosphorus. By controlling the ratio of constituent gases 
within the reactor, virtually any composition of AlInGaP 
can be grown. The reactor is designed in such a way that 
the thicknesses of the epitaxial layers can be precisely 
controlled. 

The schematic diagram in Fig. 3 shows a typical 
research-scale OMVPE reactor. In this example, the substrate sits flat 
on a horizontal graphite slab inside a quartz tube. Outside 
the tube and surrounding the graphite is a metal coil 
connected to a multikilowatt radio frequency generator. The 
graphite is heated to around 700 to 800°C by RF induction. 

There are many variations on the design of the reactor 
chamber. For example, in some existing commercial 
OMVPE systems, the wafers sit on a horizontal platter and 
rotate either slowly or at high speed to achieve uniform 
growth across the wafer. Other systems use a barrel-type 
susceptor inside a large bell jar, similar to VPE and silicon 
epitaxy reactors. The method for heating the substrates can 
be RF induction, resistance heaters, or infrared lamps. 
Whatever the configuration, the conceptual nature of the 
growth process remains essentially the same. 

The organometallic sources under normal room temperature 
conditions are either high-purity liquids or crystalline solids 
and are contained in small stainless-steel cylinders measuring 
about eight inches long by two inches in diameter. (Because 
they are pyrophoric, these materials are never exposed 
to air and require careful handling.) The cylinders are 
equipped with an inlet port connected to a dip tube, and an 
exit port. Hydrogen gas flowing through the dip tube and up 
through the organometallic liquid or solid becomes saturated 
with organometallic vapors. (This type of container is commonly 
called a "bubbler," referring to the action of the hydrogen 
bubbling through the liquid.) The mixture of hydrogen 
and vapor flows out of the cylinder and to the reactor chamber. 
The exact amount of organometallic vapor transported 
to the reactor is controlled by the temperature of the bubbler, 
which determines the vapor pressure of the organometallic 
material, and by the flow of hydrogen. The temperature of 
the bubblers is controlled by immersion in a fluid bath in 
which the temperature is regulated within ±0.1 °C or better. 
Special regulators called mass flow controllers precisely 
meter the flow of hydrogen to each bubbler. 
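
The dependence of delivered vapor on bubbler temperature and hydrogen flow can be illustrated with the usual saturated-carrier estimate: if the hydrogen leaves the bubbler fully saturated, the molar flow of organometallic is the hydrogen molar flow times P_vapor / (P_bubbler - P_vapor). The sketch below is a generic textbook estimate, not the authors' procedure; the function name, the 760-torr bubbler pressure, the 65-torr vapor pressure, and the 0°C definition of a standard cubic centimeter are all assumptions made for illustration.

```python
# 1 sccm = 1 cm^3/min of gas at 0 degC and 1 atm (assumed convention),
# and one mole of ideal gas occupies 22414 cm^3 under those conditions.
MOL_PER_MIN_PER_SCCM = 1.0 / 22414.0


def precursor_flow_umol_per_min(carrier_sccm: float,
                                vapor_pressure_torr: float,
                                bubbler_pressure_torr: float = 760.0) -> float:
    """Organometallic delivery rate, assuming the carrier gas is fully saturated."""
    if vapor_pressure_torr >= bubbler_pressure_torr:
        raise ValueError("vapor pressure must be below the total bubbler pressure")
    pickup = vapor_pressure_torr / (bubbler_pressure_torr - vapor_pressure_torr)
    return carrier_sccm * MOL_PER_MIN_PER_SCCM * pickup * 1e6


if __name__ == "__main__":
    # Illustrative numbers only: 20 sccm of hydrogen, 65 torr source vapor pressure.
    print(round(precursor_flow_umol_per_min(20.0, 65.0), 1), "umol/min")
```

Since vapor pressure rises steeply with temperature, a small drift in bath temperature changes the pickup ratio directly, which is consistent with the tight bath regulation described above.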
Fig. 3. Simplified schematic diagram of an organometallic vapor 
phase epitaxy (OMVPE) reactor. 



At the entrance to the reactor chamber, the reactant gases 
are mixed. These gases consist of phosphine, a mixture of 
hydrogen and the organometallic vapors, dopant gases, and 
additional hydrogen added as a diluent. As the gases pass 
over the hot substrate, decomposition of the phosphine, 
organometallics, and dopant gases occurs. If all the conditions 
are correct, proper crystal growth takes place in an 
orderly atomic layer-by-layer process. Hydrogen, unreacted 
phosphine and organometallics, and reaction by-products 
such as methane are then drawn out of the reactor and 
through the vacuum pump for treatment as toxic exhaust 
waste. 

The growth of III-V epitaxial materials is typically complex, 
and the successful production of high-quality films is dependent 
on many factors. The growth of AlInGaP is definitely 
no exception. Since this is a quaternary material system and 
is not automatically lattice matched to the substrate (unlike 
AlGaAs), the composition of the crystal lattice must be carefully 
controlled during the growth process. This means that 
each layer in the double heterostructure has to have the 
proper proportions of aluminum, indium, and gallium. Furthermore, 
the transition from one layer composition to the 
next often requires special consideration to avoid introducing 
defects into the lattice. Other factors, such as substrate 
temperature, total gas flow through the reactor, and dopant 
concentrations require careful optimization to achieve the 
best final device properties. Even after years of research 
with OMVPE, there is still a certain amount of art involved 
in its practice. 

AlInGaP Device Structure 

As mentioned previously, the high-efficiency AlInGaP LED is 
a double-heterostructure device. Fig. 4 shows a cross-section 
of a Hewlett-Packard LED with the individual epitaxial layers 
revealed. The light-producing part of the structure consists 
of a lower confining layer of n-type AlInP, a nominally 
undoped AlInGaP active layer, and an upper confining layer of 
p-type AlInP. Light is generated in the active layer through 
the recombination of carriers injected from the p-n junction. 
The confining layers enhance minority carrier injection and 
spatially confine the electrons and holes within the active 
layer, increasing the probability for band-to-band recombination. 
For such a structure, the internal quantum efficiency 
(number of radiative recombinations per total number of 
recombinations) can be very high, even approaching 100% 
for the best-quality materials. 
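
The definition of internal quantum efficiency given in parentheses above amounts to a simple ratio of recombination rates. The sketch below states it in code; the function name and the example rates are hypothetical and serve only to show how nonradiative recombination lowers the figure.

```python
def internal_quantum_efficiency(radiative_rate: float, nonradiative_rate: float) -> float:
    """Radiative recombinations divided by all recombinations (same units for both rates)."""
    total = radiative_rate + nonradiative_rate
    if radiative_rate < 0 or nonradiative_rate < 0 or total <= 0:
        raise ValueError("rates must be non-negative and not both zero")
    return radiative_rate / total


if __name__ == "__main__":
    # Illustrative rates in arbitrary units: defects add nonradiative recombination.
    print(internal_quantum_efficiency(9.0, 1.0))   # few defects
    print(internal_quantum_efficiency(9.0, 9.0))   # defect-dominated material
```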

On top of the double heterostructure is grown another layer, 
which serves two functions. First, it reduces the sheet resistance 
of the p-type layers, promoting current spreading 
throughout the chip, and second, it acts as a window layer 
to enhance coupling of the light out of the chip. Early in the 
development phase of the AlInGaP LEDs it was discovered 
that the thin upper confining layer of AlInP, ideal for confining 
electrons and holes in the active layer, is resistive and by 
itself prevents current from the central ohmic contact 
(shown in Fig. 4) from spreading out to the edges of the 
chip. In fact, with only AlInP as the top layer, virtually all of 
the current flows straight down, and light generation occurs 
only beneath the contact and is blocked from escaping the 
chip by the contact itself. With the addition of a thick 
conductive window, such as GaP, the current is able to spread 
out, and light generation occurs across the entire chip. 
Additionally, because the index of refraction of semiconductors 
is high (typically around 3.5), without the window much of 
the light produced is trapped inside the chip by total internal 
reflection and is eventually absorbed by the substrate. Using 
Snell's law and geometric optics, it can be shown that the 
thick window layer increases the amount of light that can 
escape the chip by a factor of three.9 
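
The trapping effect can be estimated from Snell's law. Light striking the chip surface at more than the critical angle is totally internally reflected, so only rays inside a narrow "escape cone" leave through a given face. The sketch below computes that cone for an index of 3.5, assuming isotropic emission and ignoring Fresnel reflection losses; it is a simplified illustration with hypothetical function names and does not attempt to reproduce the factor-of-three result cited above.

```python
import math


def critical_angle_deg(n_inside: float, n_outside: float = 1.0) -> float:
    """Critical angle for total internal reflection, from Snell's law."""
    return math.degrees(math.asin(n_outside / n_inside))


def escape_cone_fraction(n_inside: float, n_outside: float = 1.0) -> float:
    """Fraction of isotropically emitted light inside one escape cone.

    Solid angle of a cone with half-angle theta_c divided by the full sphere:
    (1 - cos(theta_c)) / 2. Fresnel losses are ignored (assumption).
    """
    theta_c = math.asin(n_outside / n_inside)
    return 0.5 * (1.0 - math.cos(theta_c))


if __name__ == "__main__":
    n_chip = 3.5  # typical semiconductor index quoted in the text
    print(round(critical_angle_deg(n_chip), 1), "degrees into air")
    print(round(100 * escape_cone_fraction(n_chip), 1), "% per cone into air")
    # An epoxy dome (index of about 1.5, assumed) widens the cone.
    print(round(100 * escape_cone_fraction(n_chip, 1.5), 1), "% per cone into epoxy")
```

With so little light escaping through a single cone, any structure that lets additional cones reach a surface before the light is absorbed, such as a thick transparent window, has a large effect on output.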

Conceptually, any transparent and conductive epitaxial material 
could serve as the window material. From a practical 
standpoint, however, there are few epitaxial materials that 
can be grown on the AlInGaP layers that satisfy the requirements 
of transparency and electrical conductivity. The two 
best materials are AlGaAs and GaP. AlGaAs is a lattice 
matched material with good epitaxial growth characteristics 
and acceptable conductivity. However, it is transparent only 
in the red and orange spectral range. At wavelengths below 
about 610 nm, AlGaAs begins to absorb significantly. GaP, on 
the other hand, although mismatched to the AlInGaP lattice 
by about 4%, is highly conductive and transparent in the spectral 
region from red to green, which is perfect for the spectral 
range of AlInGaP. 
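
The size of the GaP mismatch can be estimated from lattice constants. Because the AlInGaP layers are lattice matched to the GaAs substrate, the GaAs lattice constant stands in for the heterostructure here. The room-temperature values below are commonly tabulated figures and are assumptions of this sketch, not numbers taken from the article.

```python
# Commonly tabulated room-temperature lattice constants, in angstroms (assumed values).
A_GAAS = 5.6533  # also the lattice constant of AlInGaP lattice matched to GaAs
A_GAP = 5.4505


def lattice_mismatch_percent(a_layer: float, a_substrate: float) -> float:
    """Relative difference between layer and substrate lattice constants, in percent."""
    return 100.0 * (a_layer - a_substrate) / a_substrate


if __name__ == "__main__":
    print(round(lattice_mismatch_percent(A_GAP, A_GAAS), 2), "%")
```

The negative sign indicates that GaP has the smaller lattice constant; the magnitude, a few percent, is large by epitaxial standards.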

From an epitaxial standpoint, the successful abrupt growth 
of lattice mismatched GaP on an AlInGaP heterostructure is 
an interesting phenomenon. Normally, one would not expect 
GaP to grow as a single crystal layer directly on a 
mismatched "substrate" such as an AlInGaP heterostructure. It 
usually takes special growth techniques, such as alloy grading 
from one composition to the other, to achieve a gradual 
change from the substrate lattice constant to that of the 
desired layer. (This is the common technique used for GaAsP 
epitaxy on GaAs and GaP substrates. The grading takes 
place over a distance of tens of micrometers of epitaxial 
material.) We have developed a technique for growing the 
GaP window directly on the AlInGaP heterostructure. The 
GaP at the interface with the AlInP contains a dense network 
of crystal defects (dislocations) caused by the lattice 
mismatch. The defect-rich layer is only a few hundred 
nanometers thick. It appears to have no effect on the transparency 
or conductivity of the window, and the defects do not 
propagate down into the high-quality heterostructure where 
the light is generated. 

Instead of growing the thick GaP window using the OMVPE 
technique, after the heterostructure growth is completed the 
wafers are removed from the OMVPE reactor and transferred 
to a conventional hydride VPE reactor where a 
45-micrometer layer of GaP is deposited to complete the 
structure. The reason for the two-step growth process is to 
save time and cost. Organometallic sources are expensive, 
whereas hydride VPE requires only metallic gallium as a 
source. Also, the crystal growth rate using VPE can easily be 
ten times higher than with OMVPE, which is desirable for 
the growth of thick layers. 

Device Fabrication 

The fabrication of LED chips is relatively simple compared
to IC chip technologies. There is generally no high-resolution
photolithography involved, and often there is no multilayer
processing. The main problems arise because of the inherent
difficulties in working with III-V semiconductor materials.
These processes are notorious for working one day and not
working the next, often without a clear explanation for the
change. Processing operations, such as premetallization
cleaning, metal etching, contact alloying conditions, and
dicing-saw cut quality, are constantly monitored and adjusted
for optimum device performance.

In its simplest form, the process for making AlInGaP chips
involves a metallization for the anode front contact pattern
(usually a circular dot with or without fingers to promote
current spreading), mechanical and/or chemical thinning of
the wafer to achieve the proper die thickness, metallization
on the back of the substrate for the cathode contact, and
sawing the wafer into individual dice. The dice are
assembled into the various lamp or display packages using
automated pick-and-place machines. Conductive silver epoxy is
used to attach the die to its leadframe, and gold-wire
thermosonic bonding is used to bond to the top dot contact. In
the case of a lamp package, the manufacturing process is
completed by casting an epoxy dome around the leadframe.
A cross-sectional view of a chip in a lamp package is shown
in Fig. 4. Every device is tested to check the electrical
characteristics, including the forward voltage at a specified
current (usually 20 mA) and the reverse breakdown voltage at a
specified current (usually -50 µA). Optical performance is
also measured to check for light output flux, on-axis intensity,
and dominant wavelength.

AlInGaP Performance

The operating characteristics of AlInGaP devices have
already been briefly described, especially their high light
output performance compared to other technologies. A more
detailed analysis of AlInGaP performance is shown in Figs. 5
and 6. Fig. 5 shows the external quantum efficiency for
AlInGaP T-1¾ lamps as a function of emission wavelength
from about 555 nm to 625 nm. (These LEDs have the same
double-heterostructure configuration except for the
composition of the active layer, which is adjusted to vary the
emission wavelength.) Other types of T-1¾ LED lamps are
included for comparison. Drive current is 20 mA in all cases.
External quantum efficiency is a measure of the number of
photons emitted from the device per electron crossing the
p-n junction and is dependent on the efficiency of the
semiconductor device at producing photons (the internal quantum
efficiency) and on the ability to get those photons out of the
chip and out of the lamp package (package efficiency). If
every electron-hole pair produced a photon and every photon
were extracted from the device and measured, the external
quantum efficiency would be 100%.
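
As a rough illustration of this definition, external quantum
efficiency can be estimated from a measured optical output
power, the emission wavelength, and the drive current. The
sketch below is not taken from the measurements in this
article. The function name and the 2.4-mW output power are
hypothetical and were chosen only to show the arithmetic.

```python
# Illustrative only: the function name and sample numbers are assumptions.
H_C_EV_NM = 1239.84193  # Planck constant times speed of light, in eV*nm


def external_quantum_efficiency(optical_power_w, wavelength_nm, current_a):
    """Photons emitted per electron crossing the p-n junction."""
    photon_energy_ev = H_C_EV_NM / wavelength_nm
    # Photons per second divided by electrons per second. The electron
    # charge cancels when the photon energy is expressed in eV.
    return optical_power_w / (current_a * photon_energy_ev)


# Hypothetical lamp: 2.4 mW of light at 620 nm from a 20-mA drive current.
eqe = external_quantum_efficiency(2.4e-3, 620.0, 20e-3)
print(f"external quantum efficiency: {eqe:.1%}")
```

For these made-up inputs the result is about 6%, meaning that
roughly one electron in seventeen yields a photon that
leaves the lamp.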
Fig. 5. External quantum efficiency of T-1¾ AlInGaP lamps compared
to other technologies. Also shown is the CIE human-eye response
curve.

Internal quantum efficiency is limited by the crystalline
quality of the semiconductor, by the bandgap properties of the
semiconductor, and by the device structure (homojunction
or heterojunction). In the spectral range between 625 and
600 nm, the efficiency is almost flat. Here, crystalline quality
is good, and the bandgap of the active layer is direct and
well away from the indirect crossover. Also, the bandgap
difference between the active layer and the upper and lower
confining layers is large, providing adequate trapping of
electrons and holes within the active layer and efficient
radiative recombination.

As the wavelength is reduced by increasing the
aluminum-to-gallium ratio in the active layer, several effects begin to
lower the overall internal quantum efficiency. First, as the
direct/indirect-bandgap crossover is approached, there is a
greater probability for indirect-bandgap nonradiative
transitions. This effect increases dramatically as the wavelength is
reduced. Second, because aluminum is such a highly
reactive atomic species, it has the tendency to bring undesirable
contaminants, especially oxygen, into the crystal lattice with
it. These impurities act as nonradiative recombination
centers for electrons and holes. Consequently, as the proportion
of aluminum in the active layer is increased to reduce the
emission wavelength, more nonradiative recombination
occurs. Finally, as the bandgap of the active layer is increased,
the upper and lower confining layers become less efficient
at keeping electrons and holes contained within the active
layer before they recombine.

The relative importance of these three effects is still being
investigated. Models describing direct/indirect-bandgap
effects, defect-related nonradiative recombination, and
confining layer efficiency exist. However, these models are
dependent on an accurate knowledge of the bandgap of the
material. For AlInGaP, there is still uncertainty about the
exact bandgap properties, notably the exact location of the
direct/indirect crossover. It is commonly believed that
higher efficiencies at the short wavelengths should be
achieved with improved epitaxial growth techniques,
possibly by improving the purity of the organometallic
source materials.

Fig. 6. LED luminous performance for AlInGaP compared to other
technologies. Luminous performance is the product of power
efficiency (roughly equal to quantum efficiency, Fig. 5) and the eye's
response.

Once the light is produced in the active layer, the task
becomes one of getting the light out of the chip. Because the
index of refraction of semiconductors such as AlInGaP is
high (n = 3.5, approximately), most of the generated light
that strikes the sidewalls of the chip is trapped within the
chip either because of total internal reflection or because of
Fresnel reflection. In the case of an absorbing substrate chip,
such as the present AlInGaP device, reflected rays generally
are lost to absorption in the substrate. We have minimized
the losses from total internal reflection with the addition of
the thick GaP window layer. Nevertheless, even the best
external quantum efficiency theoretically possible for a
cubic-shaped double-heterostructure absorbing substrate
chip in air is only about 2%.

The effects of total internal reflection and Fresnel reflection
are mitigated by encapsulating the chip within clear epoxy
plastic shaped with a hemispherical dome (the typical LED
lamp package configuration). The plastic acts as an
index-matching medium between the semiconductor and the air,
reducing the effects of total internal reflection and Fresnel
reflection. The hemispherical shape of the plastic eliminates
total internal reflection within the plastic itself and acts to
focus the light from the chip. Generally, the external
quantum efficiency of an encapsulated chip is increased by a
factor of three, bringing the theoretical maximum external
quantum efficiency to between 6% and 7% for an absorbing
substrate chip.
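
The size of these losses can be illustrated with a simple
escape-cone estimate. The sketch below is a simplification and
not the authors' calculation: it assumes a chip index of
about 3.5, isotropic emission, a single flat face, and
normal-incidence Fresnel transmission for every ray inside the
cone. The epoxy index of 1.5 is an assumed typical value that
does not come from the article.

```python
import math


def escape_fraction(n_chip, n_outside):
    """Fraction of isotropically emitted light leaving through one flat face."""
    critical_angle = math.asin(n_outside / n_chip)
    # Solid-angle fraction of one escape cone out of the full sphere.
    cone_fraction = (1.0 - math.cos(critical_angle)) / 2.0
    # Normal-incidence Fresnel transmission, applied to every ray in the cone.
    transmission = 1.0 - ((n_chip - n_outside) / (n_chip + n_outside)) ** 2
    return cone_fraction * transmission


N_CHIP = 3.5   # assumed approximate index for AlInGaP
N_AIR = 1.0
N_EPOXY = 1.5  # assumed value for clear epoxy, not from the article

in_air = escape_fraction(N_CHIP, N_AIR)
in_epoxy = escape_fraction(N_CHIP, N_EPOXY)
angle_air = math.degrees(math.asin(N_AIR / N_CHIP))

print(f"critical angle in air: {angle_air:.1f} degrees")
print(f"escape fraction per face in air: {in_air:.2%}")
print(f"escape fraction per face in epoxy: {in_epoxy:.2%}")
print(f"improvement from encapsulation: {in_epoxy / in_air:.1f}x")
```

Under these assumptions only a percent or two of the light
escapes through a face in air, and encapsulation improves
that by a little under a factor of three, which is in line
with the figures quoted above.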

From Fig. 5 it can be seen that at the longer wavelengths,
the external quantum efficiency of AlInGaP compares
favorably with that of absorbing substrate DH AlGaAs at
7%. Only TS AlGaAs has a higher external quantum efficiency,
owing to the lack of absorption by the substrate. All other
LED materials are less efficient than AlInGaP from 625 to
555 nm. In the yellow-to-orange wavelength range, this
difference is an order of magnitude or more.

Included in Fig. 5 is the CIE relative eye sensitivity curve,
which shows that the eye is most sensitive to green photons
and much less so to red photons. This curve is used to
convert external quantum efficiency data to the luminous
performance data in Fig. 6. Fig. 6 shows lumens of visible light
emitted from the LED lamp per watt of power applied to the
diode (y axis) as a function of emission wavelength (x axis).
This data is representative of how the eye actually responds
to various types of LEDs. The effect of the CIE curve is to
depress performance in the red part of the spectrum,
resulting in a dramatic increase in apparent performance of the
AlInGaP lamps compared to even TS AlGaAs lamps. It
should be pointed out that the AlInGaP data shown in Figs. 5
and 6 represents the best reported results, whereas the data
for the other technologies shows typical production values.
Production performance values for AlInGaP are not yet
established. Initially the performance will be lower than the
data shown here but is expected to increase and surpass
this data as the technology evolves and matures.
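
The conversion from power efficiency to luminous performance
can be sketched as follows. This is an illustration, not the
procedure used to produce Fig. 6. It uses the standard
683 lm/W peak luminous efficacy at 555 nm and a few rounded
points from the CIE photopic sensitivity curve. The 6% power
efficiency and the function name are hypothetical.

```python
# Rounded points from the CIE photopic eye sensitivity curve V(lambda).
# Only a few wavelengths are listed here for illustration.
EYE_SENSITIVITY = {555: 1.000, 590: 0.757, 620: 0.381, 650: 0.107}

PEAK_LUMENS_PER_WATT = 683.0  # luminous efficacy of 555-nm light


def luminous_performance(power_efficiency, wavelength_nm):
    """Lumens emitted per electrical watt for a monochromatic source."""
    sensitivity = EYE_SENSITIVITY[wavelength_nm]
    return PEAK_LUMENS_PER_WATT * sensitivity * power_efficiency


# The same hypothetical 6% power efficiency at two emission wavelengths.
for wavelength in (620, 650):
    lm_per_w = luminous_performance(0.06, wavelength)
    print(f"{wavelength} nm: {lm_per_w:.1f} lm/W")
```

With equal power efficiency, the 620-nm emitter appears more
than three times brighter than the 650-nm emitter, which is
the effect the eye response has on the red end of the
comparison.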

Also indicated in Fig. 6 are the luminous performance levels
for automotive incandescent lamps, both filtered and
unfiltered. These benchmarks are useful because of the interest
in using LEDs instead of incandescent lamps for tail lights,
brake lights, turn signals, and side marker lights on
automobiles and trucks. The high efficiency of AlGaAs and AlInGaP
LEDs and their long lifetimes make them attractive
alternatives to incandescent light bulbs in the automotive industry.
Because LEDs can be assembled into a smaller package
than an incandescent bulb, automotive design can be more
flexible and overall manufacturing costs lower.

The reliability of AlInGaP LEDs is generally good compared
to other types of LEDs. Stress tests in which devices are
driven at currents up to 50 mA at ambient temperatures
ranging from -40 to +55°C show good light output and
electrical stability beyond 1000 hours. Since AlInGaP LEDs have
not existed for very long, device lifetime data as long as
10,000 hours is scarce. However, indications are that there
are no inherent reliability problems associated specifically
with AlInGaP.

For some stress conditions, AlInGaP performs significantly
better than other products. For example, in high-temperature,
high-humidity conditions, AlGaAs LEDs fail rapidly because
of corrosion of the high-aluminum-content epitaxial layers.
Since the overall aluminum content of AlInGaP devices is
less than for AlGaAs, this corrosion problem does not
appear, and AlInGaP LEDs perform very well in high-humidity
conditions. Also, it is well-known that standard yellow
GaAsP LEDs exhibit serious light output degradation when
operated at low temperatures. AlInGaP LEDs demonstrate
excellent low-temperature stability.



Of course, good LED device performance and reliability do
not happen automatically. There are many conditions that
occur during the growth of the epitaxial material and during
device processing that affect initial light output, electrical
characteristics, and device longevity. In fact, many factors
affecting performance are not completely understood at this
time. With ongoing analysis of the problems that occur,
additional insight into the properties of AlInGaP epitaxial growth
and device design will follow.

HP AlInGaP Products

The proliferation of AlInGaP chips into various LED packages
will be an ongoing process over the next few years. Initial
market demands are for T-1¾ lamp packages for moving
message signs, highway warning markers, and automotive
and truck lighting applications. As of this writing, several
AlInGaP lamp packages are available in three colors from
amber to red-orange. These products are listed in Fig. 7.

Conclusion 

We have attempted to provide a general description and
understanding of HP's new family of LEDs made from
AlInGaP. We have compared the performance and production
of AlInGaP devices with other LED technologies. We have
also tried to give the reader a general understanding of LEDs
and the III-V processes necessary for their manufacture.

HP's AlInGaP devices represent the brightest visible LEDs
that have ever been made. Interest in them is quickly
growing as manufacturers come up with new applications for
them. Although comparably bright red AlGaAs LEDs have
been available for several years, the appearance of bright
orange and yellow lamps has made possible total LED
replacements in applications where low-wattage filament
lamps have been used exclusively. The benefits of LEDs
include long lifetime, performance reliability under a broad
range of operating conditions, and overall cost savings over
traditional incandescent lamps.

HP Task Broker: A Tool for
Distributing Computational Tasks



Intelligent distribution of computation tasks, collective computing, load 
balancing, and heterogeneity are some of the features provided in the 
Task Broker tool to help make existing hardware more efficient and 
software developers more productive. 

by Terrence P. Graf, Renato G. Assini, John M. Lewis, Edward J. Sharpe, James J. Turner,
and Michael C. Ward 



HP Task Broker is a software tool that enables efficient
distribution of computational tasks among heterogeneous
computer systems running UNIX*-system-based operating
systems. Task Broker performs its computational
distribution without requiring any changes to the application. Task
Broker relocates a job and its data according to rules set up
at Task Broker initialization. The other capabilities provided
by Task Broker include:

• Load balancing. Task Broker can be used to balance the
computation load among a group of computer systems.
Since Task Broker has the ability to find the most available
server for a computation task transparently, it can
effectively level the load on a compute group, thus helping to
make existing hardware more efficient.

• Intelligent targeting. Task Broker can transparently target
specific servers most appropriate for a specialized task. For
example, a graphics simulation application may be more
efficiently executed on a machine with a graphics
accelerator or fast floating-point capability. These targeting
characteristics can be built into the Task Broker group definition
without requiring the user to have any machine-specific
knowledge. Thus, expensive resources don't need to be
duplicated in a network.

• Collective computing. Task Broker allows a network of
workstations to form a computational cluster that can
replace a far more expensive mainframe or supercomputer.
This approach offers multiple advantages over the single
compute server model. Some of these advantages include
increased availability (no single point of failure), improved
scalability (ease of upgrade), and reduced costs. See "HP
Task Broker and Computational Clusters," on page 16.

• Heterogeneity. Task Broker can be used to create a
heterogeneous cluster, allowing a network of machines from
multiple vendors to interoperate in a completely transparent
fashion. Task Broker will run on several different
workstation platforms, all of which can interoperate as servers
and clients.

• DCE Interoperability. Task Broker is able to take advantage
of many of the services provided by HP's DCE (Distributed
Computing Environment) developer's environment. See
"Task Broker and DCE Interoperability," on page 19.

HP Task Broker runs on HP 9000 Series 300, 400, 600, 700,
and 800 computers running the HP-UX operating system,
and the HP Apollo workstations DN2500, DN3500, DN4500,
DN5500, and DN10000 running Domain/OS. In addition,
Scientific Applications International Corporation (SAIC) has
ported Task Broker to the Sun3, Sun4, and SPARCstation
platforms.

Automated Remote Access 

The need to access remote computer resources has existed 
ever since computers were tied together by local area net- 
works. Remote access gives the user a means of increasing 
productivity by allowing access to more powerful or special- 
ized computer resources. 

To access a remote resource, computer users have had to
rely on guesswork for determining optimal placement and
have been saddled with the tedious activity of manually
moving files to and from a resource.

Task Broker effectively automates the manual tasks required 
for distributing computations by: 

• Gathering machine-specific knowledge from the end user

• Analyzing machine-specific information and selecting the
most available server

• Connecting to a selected server via telnet, remsh (remote
shell), or crp (create remote process)

• Copying program and data files to a selected server via ftp
(file transfer protocol) or NFS (Network File System)

• Invoking applications over the network

• Copying the resulting data files back from the server via ftp
or NFS.

Each of the above steps is done automatically by Task Broker
without the user needing to be aware of, or having to deal
with, the details of server selection and data movement.

Server selection is one of the most significant contributions
provided by Task Broker. For the user to determine the most
appropriate server for a job manually, all of the dynamic
variables of server availability would have to be captured
before every job submittal. Because this is a time-consuming,
cumbersome process, developers trying to run a job would
spend very little time selecting an appropriate server.

Instead, developers would revert to using either their own
machine for compute jobs or just a few popular machines,
overloading those machines and underloading others. In
addition, having to manage several network connections
simultaneously to try to balance the workload is also
cumbersome, and tends to lead to the same result. The end
result is increased frustration and decreased productivity.

Task Broker automates these services, which most developers
find difficult to manage manually.

HP Task Broker and Computational Clusters

A computational cluster is a group of workstations networked together and used
as a single virtual computational resource. This notion is an extension of the Task
Broker cluster concept, since it is based on the idea that a cluster of workstations
can actually replace a mainframe.

The motivation behind this concept comes from customers who are downsizing
from a single compute server, such as a mainframe or supercomputer, or customers
who have computationally intensive tasks that can execute more effectively on a
cluster of workstations.

The advantages of the computational cluster over the resource that it is intended
to replace are several:

• The cluster can be considerably less expensive than a mainframe.

• The cluster is modular and therefore more easily upgradable.

• The cluster can consist of workstations that may already exist in the environment.

Task Broker has an obvious role in this area of computing, since the computational
cluster is really a special case of the Task Broker solution. However, it is important
to note that, in terms of distributing computations, only a portion of the mainframe
replacement solution would be provided by Task Broker in its current form.

Task Broker represents the class of solutions that provide a mechanism for
coarse-grained parallelism (i.e., giving the user the ability to run multiple tasks or
applications in parallel). The goal of this type of solution is to achieve parallelism without
impacting the application, or to maximize the use of hardware.

A finer level of parallelism can be provided by tools that can break up an
application into subtasks and run them in parallel. The subtasks can be procedures, loops,
or even instructions. The goal of these solutions is to have an application
complete in the minimum time possible, as opposed to those of the coarse-grained
alternative.

This area of computing is obviously more involved than can be covered here. The
point to be made is that customers are in need of new ways of optimizing their
use of hardware, and Task Broker can, in its current form, provide a solution. Task
Broker can provide parallelism at the application level, which is a major portion of
the computational cluster solution.

Bidding and Execution 

A machine running Task Broker can act as a client, a server, 
or both. A Task Broker client is a submitter of jobs into the 
compute group, and a Task Broker server is a machine that 
provides services for clients. A single instance of Task Bro- 
ker, called the Task Broker daemon, resides on each client 
and server. 

Each server provides one or more services for the work
group, each of which represents a specific compute job.
Servers can provide any number of services, and services
can be provided by one or more servers (which would be
necessary to load balance the compute group).

Task Broker clients and servers interact to distribute and
execute jobs in the following manner:

1. A user submits a request for a service to the local Task
Broker daemon (client daemon).

2. The client daemon sends a message to the group of
servers, requesting bids to service the submitted job.

3. The servers compute their bids, or affinity values, for the
requested service, based on their availability to accept the
job. The bids are returned to the client.

4. The client waits a preset amount of time for the servers to
return their bids and selects the server with the highest bid.

5. The client transmits the necessary files (if necessary) to
the selected server.

6. The server executes the job according to instructions in
the local execution script.

7. At job completion, the server returns the output files to the
client, which are then placed in the user's working directory.

Since every job submitted to the work group involves
bidding before acceptance by a server, and the bids can be
computed dynamically based on the server's availability at
that time, the jobs are automatically serviced by the most
appropriate machine. A failing machine will automatically
be avoided by this bidding mechanism, increasing the fault
tolerance of the group. The basis for the bids or affinity
values is described later.

If there are no available servers when bids are requested, or
if the returned bids do not exceed a preset threshold because
the servers are all being heavily used, the job will be put into
a local queue. The jobs in the local queue will be resubmitted
for bids after a preset time limit or by receiving a callback
from a newly available server. In addition, the job may
execute locally if the submitting machine can also provide the
requested service.
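
The selection and queuing behavior described above can be
sketched in a few lines. This is a minimal illustration and
not Task Broker's actual implementation: the function names,
the dictionary of bids, and the use of None for a server
that returned no bid are all assumptions made for the example.

```python
def select_server(bids, threshold=0):
    """Return the server with the highest bid above threshold, or None.

    bids maps a server name to its affinity value, or to None when the
    server returned no bid before the waiting period expired.
    """
    usable = {server: bid for server, bid in bids.items()
              if bid is not None and bid > threshold}
    if not usable:
        return None  # no acceptable bid
    return max(usable, key=usable.get)


def place_job(job, bids, local_queue, threshold=0):
    """Pick a server for the job, or queue it locally for a later retry."""
    server = select_server(bids, threshold)
    if server is None:
        local_queue.append(job)
    return server


queue = []
first = place_job("job-1", {"a": 10, "b": 40, "c": None}, queue)
second = place_job("job-2", {"a": 3, "b": None}, queue, threshold=5)
print(first, second, queue)
```

Here the first job goes to server b, which bid highest, while
the second job stays in the local queue because no bid
exceeded the threshold. A server that has failed simply
never bids, so it is never selected.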

Each daemon maintains a log file that is used to record
daemon activity. These can be used to analyze the machine
use in the work group and can be the basis for fine tuning
the Task Broker installation.

Task Broker Setup 

Task Broker setup takes place when the product is installed.
Installation and setup are performed by a Task Broker
administrator. The Task Broker administrator is a user with the
appropriate permissions to initialize and modify the Task
Broker installation of daemons and setup files.

When hardware changes are needed in the network, the
administrator needs to make sure the Task Broker setup files
are kept current. In addition, the administrator can make
changes to the daemon's setup files to fine tune the
installation. To assist the administrator in this analysis, Task Broker
can collect information about daemon and service activity
through the use of its logging feature or its accounting file.
Administrator duties are given in reference 1.

Each machine running a Task Broker daemon needs some
or all of the following files to operate as either a client or a
server:

Configuration File. This file specifies what services are
provided, when the services are available, and who has access
to these services. It also specifies how services are to be
provided, and under what conditions (see Fig. 1). The
contents of a configuration file are divided into the following
categories:

Fig. 1. Overview of Task Broker configuration files.

• Global parameters. These parameters specify changes to
Task Broker default values that govern the global conditions
on the local Task Broker computer. The parameter that
governs the waiting period for the task placement process and
the parameter that specifies whether to record CPU time
used by local tasks are examples of global parameters.

• Class definition. This definition specifies the maximum
number of services belonging to a named class that can run
on the local server at one time. Every service specified must
be a member of the specified class. For example, Spice
might be a member of a class specified as cadtools.

• Client definition. This definition specifies the servers that
can provide service to a client.

• Service definition. This definition specifies items such as
the local server's ability to provide a particular service, how
the service will be processed, the affinity value or affinity
script, and a list of clients that have access to the service.

Service Script. This is a shell script that defines how each
service being provided by the server is carried out. This script
typically invokes an application that provides the requested
service. This script is specified by the ARGS parameter in the
service definition portion of the configuration file.



Affinity Script. This script defines the algorithm to be used by
the server to compute the affinity value when a job is bid on.
If a constant is used to define the affinity value, this script is
not needed.

Submit Script. This script, which is invoked from a Task
Broker client, submits a service request for a Task Broker
service. A service request contains information such as
optional parameters or data files that cause the service to be
run in a specific manner.

Affinity Value 

The affinity value is an integer from 0 to 999 that quantifies a
Task Broker server's ability to provide a specific service.
The value may reflect the availability of certain computer
resources such as disk space or other factors essential to
perform the service.

Affinity values can either be hard-coded into the service
script, which resides in each server's configuration file, or
can be calculated before each bid submittal through the use
of an affinity script. For example, the following script uses a
hard-coded affinity value.

# Task Broker Service Definition
#

Service foo
CLASS = service_tasks
MAX_NUMBER = 2
ALLOW = (12.34.567.*)
MIN_FREESPACE = 30000
AFFINITY = 10

Endservice

In the above case, when the server daemon receives a service
request for the foo service, it checks the service definition in
its configuration file. In this case, the daemon checks several
parameters for each service request it receives. Some of
these checks ask the following questions:

• Is the number of tasks running less than the maximum
(MAX_NUMBER)?

• Is the requester allowed to run the service here (ALLOW)?

• Are there 30M bytes of free disk space (MIN_FREESPACE)?

If the answer to all the above questions is "yes," the server
daemon sends the affinity value of 10 as its bid for the
requested service. If any of the answers is "no," no bid is
returned to the requesting client.
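
The decision the server daemon makes for this service
definition can be sketched as follows. This is an illustration
of the three checks only, not Task Broker code: the class,
the function, and its arguments are hypothetical, and the
ALLOW pattern is treated as a simple wildcard match, which
is an assumption about how the real parameter behaves.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class ServiceDefinition:
    """Hypothetical in-memory form of the foo service definition."""
    max_number: int     # MAX_NUMBER
    allow: str          # ALLOW, treated here as a wildcard pattern
    min_freespace: int  # MIN_FREESPACE, in the same units as free_space
    affinity: int       # AFFINITY (hard-coded bid)


def compute_bid(definition, running_tasks, requester, free_space):
    """Return the affinity value as the bid, or None for no bid."""
    if running_tasks >= definition.max_number:
        return None  # already running the maximum number of tasks
    if not fnmatchcase(requester, definition.allow):
        return None  # requester is not allowed to use this service
    if free_space < definition.min_freespace:
        return None  # not enough free disk space
    return definition.affinity


foo = ServiceDefinition(max_number=2, allow="12.34.567.*",
                        min_freespace=30000, affinity=10)

print(compute_bid(foo, running_tasks=1, requester="12.34.567.8",
                  free_space=45000))
print(compute_bid(foo, running_tasks=2, requester="12.34.567.8",
                  free_space=45000))
```

The first request passes all three checks and receives the
bid of 10. The second receives no bid because two tasks are
already running.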

The service definition can also invoke an affinity script as in 
the following example. 

# Task Broker Service Definition
#

Service foo
CLASS = service_tasks
AFFINITY = "/users/tbroker/lib/foo.aff"

Endservice

The shell script foo.aff could possibly include the parameters
specified in the first example's service definition such as
MAX_NUMBER, ALLOW, and MIN_FREESPACE. It could also include
checks on the machine or user submitting the request and
checks on whether the data to be accessed is locally
resident. The result is that depending on the outcome of the
checks, the script will or will not send an affinity value to a
requesting client.

For load balancing to take place properly, the affinity scripts
should be identical on every computer in the compute
group. Since the affinity values returned by the server
daemons directly affect the placement of jobs in a work group,
proper parameter selection in the affinity scripts is the key
to optimal server selection.
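
To show why identical scripts matter, the sketch below
computes a dynamic affinity value from machine load. The
article does not list the contents of an affinity script, so
the formula, the function name, the sample machines, and the
maximum value of 999 are all assumptions. Real affinity
scripts are shell scripts. Python is used here only to show
the logic.

```python
def affinity_from_load(load_average, cpu_count, max_affinity=999):
    """Hypothetical affinity rule: bid higher when more CPU is idle."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    idle_fraction = max(0.0, 1.0 - load_average / cpu_count)
    return int(max_affinity * idle_fraction)


# Sample machines as (load average, CPU count). All values are made up.
servers = {
    "machine_a": (0.5, 2),
    "machine_b": (3.0, 4),
    "machine_c": (6.0, 4),
}

# Every server applies the same rule, so the bids are directly comparable.
bids = {name: affinity_from_load(load, cpus)
        for name, (load, cpus) in servers.items()}
best = max(bids, key=bids.get)
print(bids, best)
```

Because each machine uses the same rule, the highest bid
always comes from the machine with the most idle capacity. If
one server used a different scale, its bids would win or lose
regardless of its actual load.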

Example: A Distributed Make Facility 

This example will show how Task Broker can be used to
create a distributed make facility, enabling compilations to be
distributed to different workstations on the network so that
they can execute concurrently, resulting in linked binaries
when everything is completed successfully. The procedure is
summarized in Fig. 2.

The process begins with the user on the client machine
creating C program source files and placing them in the
source file directory. Compiles are initiated at the client by
executing a makefile, which in turn invokes a submit script
(thmake in this example). The submit script submits a
compile request to the local daemon, and the client daemon
submits a make6 5_serv request, including information about
the directory containing the source files. The servers each
bid on the compile request, and when the selected server is
available (machine A in this case), its daemon accepts the
compile request and invokes its local cc service script. The
service script can access the client's file system via NFS,
with Task Broker performing the file system mounts if
necessary. Each of these steps is marked in Fig. 2.

The client's submit script is written such that it will wait for
successful completion of all compiles before requesting bids
for the link service. When the server accepts the link request,
the compiled code is linked to create an executable program.
Finally, the file system containing the source files and the
executable program file is unmounted.
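
The control flow of the submit script, in which compiles run
concurrently and the link waits until all of them succeed,
can be sketched as below. This is not the script used in the
example. The compile and link functions are stand-ins that
only manipulate file names, and a thread pool takes the place
of the remote servers.

```python
from concurrent.futures import ThreadPoolExecutor


def compile_source(source):
    """Stand-in for a remote compile service: returns the object file name."""
    return source.rsplit(".", 1)[0] + ".o"


def link_objects(objects, output):
    """Stand-in for the link service: describes the link step."""
    return f"{output} <- {' '.join(sorted(objects))}"


def distributed_make(sources, output="a.out", workers=3):
    """Run all compiles concurrently, then link once every one has finished."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() blocks until every compile is done and re-raises the
        # first failure, so the link below never runs after an error.
        objects = list(pool.map(compile_source, sources))
    return link_objects(objects, output)


result = distributed_make(["main.c", "util.c", "io.c"], output="prog")
print(result)
```

The link step is reached only after every compile has
returned, which mirrors the way the submit script delays its
request for the link service.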

This example demonstrates several key features of Task
Broker:

• Multiple instances of existing applications can be executed
concurrently in a work group with very little effort.

• Task Broker provides a flexible way of delaying the
execution of an application until conditions necessary for its
execution are in place. In this case, the link operation
was delayed until the distributed compiles completed
successfully.

• The service scripts can be written to access remote data via
mechanisms such as NFS mounts of client file systems.

Configuration Strategies 

The following two examples of Task Broker configurations 
will demonstrate different philosophies of its use. 

Task Broker with a Mainframe. The first example, which is
shown in Fig. 3, illustrates how a group of Task Broker
daemons can have their services augmented by a mainframe.



Task Broker and DCE Interoperability 

The HP DCE (Distributed Computing Environment) developer's environment
provides a common standards-based framework for distributed administration,
application development, and execution in a network of heterogeneous computer
systems. Designed to support the HP 9000 Series 700 and 800 computer systems
running the HP-UX 9.0 operating system, HP's DCE developer's environment is an
implementation of the Open Software Foundation (OSF) DCE developer's services
with additional tools for DCE-based application development.

The DCE core services include security service, remote procedure call (RPC),
directory services, time service, and threads. Extended services such as the
distributed file system are also provided.

The current Task Broker already benefits from, and can make use of, many of these
DCE services. Since DCE was designed to provide benefits without necessarily
requiring changes to existing applications, Task Broker can invoke applications
that explicitly use some DCE services without modifications to Task Broker or the
applications invoked by Task Broker. These DCE services include:

• Remote Procedure Call (RPC). An application written using RPC can be distributed
in a work group by Task Broker.

• Time Service. The host machines in a compute group can use the time service to
keep their clocks synchronized. This can greatly simplify the management of a
Task Broker installation because items such as the Task Broker daemon log files
will have their time stamps synchronized.

• Directory Services. Applications that make use of directory services can be
managed by Task Broker without restriction.

• Threads. As with RPC, a multithreaded application can be distributed by Task
Broker without modification.

• Distributed File System. This feature is not only compatible with Task Broker,
but will greatly simplify distributed access in the Task Broker work group.

• Diskless Support. Task Broker will operate on diskless machines without
modification.

For Task Broker to take advantage of other DCE services, such as the security
service, will require internal changes to Task Broker.

Each of the Task Broker daemons acts as a server representing
a mainframe service in the work group. The bids made
by the daemons indicate the ability of the mainframe to take
on additional work. Using Task Broker to combine a group
of workstations with a mainframe in this way has several
key advantages:

• The mainframe resources can become transparently and
seamlessly included in the work group without porting any
of its applications.

• The workstation users can gain access to mainframe resources
without machine-specific knowledge, or even any
knowledge that the mainframe is being accessed in their
calculations.

• A Task Broker daemon does not need to be present on every 
host in a work group because a host can have a surrogate 
server in the group acting on its behalf. 

The result in this example is that Task Broker allows overall 
hardware use to increase along with the group's productivity 
with minimal impact on either hardware or software and 
little added expense. 

Flexible Work Group. The second example of a Task Broker 
configuration demonstrates how Task Broker can be used to 
create a flexible work group. During the day the clients 
shown in Fig. 4 access a dedicated server group, and during 
the evening hours, when most users have gone home, some 
of the clients become servers. 

This example makes use of Task Broker's ability to delay the
submittal or acceptance of jobs until after a certain time of
day has passed. This can be done either in the submit script,
delaying the time when clients request bids for the service,
or by setting the "time-of-day" parameter in the affinity
script, delaying the time when certain server daemons will
begin generating bids for any service.
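
The server-side variant can be pictured as a bid function that stays silent during working hours. The sketch below is an assumption-laden illustration in Python, not the affinity script syntax: the cutoff times, the 0 to 100 bid scale, the load-based scoring, and the convention that a bid of 0 means "send no work here" are all invented for the example.

```python
from datetime import time

# Assumed policy values; a real installation would choose its own.
EVENING_START = time(18, 0)
MORNING_END = time(7, 0)


def affinity(now, load_average):
    """Return a bid for a workstation that acts as a server only after hours.

    `now` is a datetime.time and `load_average` is the host's current load.
    """
    after_hours = now >= EVENING_START or now < MORNING_END
    if not after_hours:
        return 0  # assumed convention: a zero bid keeps work away
    # Assumed scoring: a lightly loaded host bids higher, never below zero.
    return max(0, 100 - int(load_average * 25))
```

With these assumed values, a client machine bids 0 at noon regardless of load and bids according to its load at 10 p.m., so it joins the server pool without any manual step.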

Using Task Broker to implement this form of flexible
configuration can contribute to a group's productivity in
several ways:

• Workstation users can access dedicated compute services 
during the day (in this case the server pool) and can have 
their machine automatically added to the server pool after 
work hours. 

• Large jobs requiring a large amount of compute power can
be queued to execute after hours to take advantage of the
increased size of the server pool.

• Once the Task Broker work group has been set up as de- 
scribed, no intervention is needed to maintain a flexible 
configuration. If a user wishes to remove a machine from 
the server pool, a quick change to its affinity script is all 
that is necessary. 

These two examples are intended to show that Task Broker
can be used to add flexibility to an existing network as
well as increase access to computer resources that were
previously inaccessible.

Task Broker and Other Alternatives 

The strategy behind the Task Broker design is that in most
cases the user is interested in:

• Having a job placed and executed as efficiently as possible
and not in controlling the placement of a job

• Distributing tasks at the application level rather than the 
procedure level 

• Having a tool that will require no changes to the application 
to perform its function. 

Job Placement. Although Task Broker provides the user with
the ability to target specific machines for specialized tasks,
its primary emphasis is to free the user from concerns about
job placement. In environments using scarce resources,
such as a single supercomputer, there is a similar need for a 
tool to provide a way of preventing users from monopolizing 
that resource. 

For example, suppose some installation has a tool that controls
job queues on a mainframe. In this case, the user submits
a request to one of several queues along with a set of
options specifying execution limits, priority, and so on. The
tool then accepts or rejects the queued job based on resource
limits and other factors. If accepted, and there are
available slots for immediate execution, the tool removes
the request from the head of the queue and the request is
serviced. The request will execute concurrently with other
accepted jobs, based on an administrative limit.

Task Broker provides a more general solution to this problem.
It views the entire network of machines as a scarce
resource, and by load balancing the resources, it prevents
any one machine in the group from being monopolized, or
any user from monopolizing too many resources. Thus,
Task Broker will not forward a job to a server unless one
is sufficiently available.

In addition, Task Broker provides mechanisms such as file
transfers, remote file system mounts, and affinity calculations 
based on configuration that obviate the need for concerns 
about job placement. 
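
The placement rule described in this section (send the job to the best bidder, and to no one if no server is sufficiently available) can be sketched as follows. The function name and the minimum_bid cutoff are assumptions made for illustration; the article does not say how "sufficiently available" is encoded.

```python
def select_server(bids, minimum_bid=1):
    """Pick the server with the highest bid, or None if none qualifies.

    `bids` maps a server name to its affinity value. `minimum_bid` is an
    assumed cutoff below which a server is treated as not sufficiently
    available.
    """
    eligible = {name: bid for name, bid in bids.items() if bid >= minimum_bid}
    if not eligible:
        return None  # the job is not forwarded to any server
    return max(eligible, key=eligible.get)
```

Because the choice depends only on the bids, the user never names a machine; a busy or disabled host simply drops out of the eligible set.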

Granularity of Distribution. Task Broker distributes tasks at the 
application level. Alternate strategies of distributed compu- 
tation, such as remote procedure call (RPC), provide remote 
placement at the procedure level. 



HP Task Broker Version 1.1 



The accompanying article describes the features provided in the first version of
Task Broker. The new version of Task Broker contains all the features contained in
Version 1.02 and adds the following features:

• A graphical user interface (GUI) has been added to improve the product's ease of
use. The GUI provides a visual interface to most of the Task Broker's command set
and configuration information. Fig. 1 shows some of the windows provided in this
new GUI for configuration management.

• Centralized configuration management has been added to allow the entire Task
Broker installation to be initialized using a single group configuration and to be
administered from any single machine site. What this means is that the data in
the configuration files described in the accompanying article can be located at one
machine site.

• An integrated forms-based configuration editor is provided. The configuration
syntax is simpler and checking is done during the editing session.

• Finally, an online, context-sensitive help subsystem has been added.

The difference represents a trade-off of computational control
versus ease of implementation. RPC requires procedure
calls in an application to be replaced by call stubs in an
intermediate definition language. These stubs handle the remote
placement of the actual procedure call. As such, RPC requires
customized application source code, most of which must be
redesigned and reimplemented if not originally implemented
using RPC.

With RPC, the procedure is usually located on a centralized
server, or replicated in several places (requiring the servers
to keep the replicas synchronized). While the server side of
the application is executing, the client side is not, reflecting
the synchronous nature of procedure calls.

In summary, Task Broker is nonintrusive to application
source code (satisfying the third user interest above) and
allows the execution of the applications it distributes to take
place concurrently. It does, however, limit the user to remote
placement at the application level. RPC gives a finer level of
computational control, but requires source code changes and
does not provide a mechanism for concurrent execution.

Conclusion 

Task Broker can provide many benefits to an organization
with a network of computers. Because of its flexibility, Task
Broker can easily be tailored to provide a simple distributed
solution to many additional types of situations. As a tool for
distributing computation tasks, Task Broker can provide a
way to make existing hardware more efficient by increasing
its level of use, and software developers more productive by
providing a way to access an expanded set of computing
resources.

The HP-RT Real-Time Operating
System



An operating system that is compatible with the HP-UX* operating system 
through compliance with the POSIX industry standards uses a multi- 
threaded kernel and other mechanisms to provide guaranteed real-time 
response to high-priority operations. 

by Kevin D. Morgan



HP-RT† is Hewlett-Packard's real-time operating system for
PA-RISC computers. It is a run-time-oriented product (as
opposed to a program-development-oriented product) based
on industry standard software and hardware interfaces.
HP-RT is intended to be used as a real-time data acquisition
and system control operating system. It is designed around
the real-time system principles of determinism (predictable
behavior), responsiveness, user control, reliability, and
fail-soft operation. These characteristics distinguish a real-time
operating system from a nonreal-time operating system.
This article reviews some of these characteristics of HP-RT
and discusses the specific designs used to provide these
features.

HP-RT runs on the HP 9000 Model 742rt VMEbus board-level
computer, which is based on HP's PA-RISC 7100 technology
(see Fig. 1). The 742rt is designed to fit into a VMEbus card
cage or an HP 9000 Model 747i industrial workstation
cabinet.¹

The HP-RT kernel is compatible with the HP-UX operating
system through compliance with the following industry
standards:

• POSIX (Portable Operating System Interface) 1003.1, which
defines a standard set of programmatic interfaces for basic
operating system facilities

• POSIX 1003.4 draft 9, which defines the standards for
real-time extensions

• POSIX 1003.4a draft 4, which defines the standards for
process-level threads.

HP-RT also supports C/ANSI C, C++, PA-RISC assembly language,
and many SVID/BSD (System V Interface Definition/
Berkeley Software Distribution) commands and functions.

HP-RT Software 

The HP-RT software is divided into two main categories: the
HP-RT kernel and the optional HP-RT services (see Fig. 2).

HP-RT Services. The optional HP-RT services include the
following components:

• Network services including the Network File System (NFS),
TCP/IP, Berkeley sockets, and ARPA/Berkeley networking
services

† HP-RT is derived from a third-party operating system called LynxOS from Lynx Real-Time
Systems, Inc. All kernel-level algorithms and data structures described in this paper are based
on LynxOS features.



• Libraries for developing OSF/Motif graphical user interfaces 
and X clients 

• Development tools to help users create applications to run
in the HP-RT environment

• Cross debuggers hosted on an HP-UX development workstation
for debugging the HP-RT kernel or applications
running on an HP-RT target system.




Fig. 1. (a) The HP 9000 Model 742rt board-level computer.
(b) An HP 9000 Model 747i industrial workstation with a Model
742rt loaded in it.

[Fig. 2 diagram. Blocks shown: Application Program; HP-RT Services
(development tools, cross debuggers, graphical user interface (GUI)
tools, TCP/IP, NFS); HP-RT Kernel (file system, I/O drivers,
semaphores, memory management, system clock and timers, scheduling,
multitasking, multithreading, interrupt handling, character I/O,
interprocess communication); HP 9000 Model 742rt Hardware.]

Fig. 2. The HP-RT kernel and services.

Kernel Software. The HP-RT kernel is designed so that it can
be scaled to balance memory and performance requirements.
It is small to reduce overhead. The kernel components
include:

• A counting semaphore mechanism for process synchronization
and to help ensure atomicity around critical sections of
code.

• A system clock that generates time interrupts every 10
milliseconds. Thus, time events using standard software
interfaces have a 10-millisecond resolution. For higher
timing accuracies, drivers and user processes can access
the hardware timers on the Model 742rt. These timers have
1-µs resolutions and are 16 and 32 bits wide.

• I/O drivers for Ethernet, SCSI II, RS-232-C, and parallel I/O
for the Model 742rt computer, and guidelines for writing
VMEbus drivers

• Standard operating system services such as:
  - Scheduling, multitasking, and multithreading
  - Memory management
  - Interrupt handling
  - Character I/O
  - Interprocess communication

• POSIX 1003.1, .4, and .4a kernel services.

Many of these components are described in more detail later
in this article.

HP-RT Development Environment. The development environment
for HP-RT is shown in Fig. 3. Programs created to run
on the Model 742rt in the HP-RT environment are developed
(using PA-RISC compilers and linkers) on an HP 9000 Series
700 or 800 HP-UX system. The executable programs can be
downloaded via LAN to a local disk on the target system
(Model 742rt), or implicitly downloaded when the program is
executed via NFS mounting between the HP-RT and HP-UX
systems. The user can debug the downloaded program from
the host system via the RS-232-C and LAN connections between
the two systems. Users can customize the SoftBench
software development environment² on the development
host to launch programs to a remote HP-RT system and to
launch the correct program debugger for HP-RT program
debugging.



[Fig. 3 diagram: an HP 9000 Series 700/800 host (HP-UX operating
system) and an HP 9000 Model 742rt target (HP-RT operating system).
Other labels: SCSI, parallel I/O, printer, RS-232-C (for kernel
debugging), serial terminal, disk, DAT.]

Fig. 3. The HP-RT development environment.

The items that come with the HP-RT development toolkit
include:

• Libraries for building HP-RT kernels and user programs 

• Include files for compiling user programs and I/O drivers for
executing in an HP-RT operating environment

• Installation and user program compilation scripts 

• A pair of source-level debuggers: one for user program de- 
bugging and one for I/O driver and kernel-level debugging. 

The two remote debuggers included with the HP-RT development
kit are derived from the standard xdb debugger product
provided with the HP-UX operating system. The debugger
used for user program debugging is capable of debugging 
multithreaded user processes and communicating with the 
target HP-RT system using a TCP (Transmission Control 
Protocol) virtual circuit socket. The kernel debugger is for 
kernel-level and I/O driver debugging and communicates with 
the target HP-RT system via a dedicated RS-232-C serial 
communication link. Using a dedicated communication link 
allows the kernel debugger to operate without interfering 
with the normal operation of the target operating system. 

A set of user commands, a bootable kernel, and miscella- 
neous files are included with the HP-RT system. These items 
can be installed via LAN on a disk connected to the target 
system. The HP-RT kernel can also be booted across a LAN 
and commands and user programs can either reside in RAM 
memory (via a RAM disk facility) or be accessed across the 
network via NFS mount points. The command set on the 
HP-RT target system is oriented around run-time operations 
and system administration. Commands related to program
development (such as cc and the rcs and sccs tools) are not
supported and can only be used on the host.

HP-RT Hardware 

The hardware that supports execution of the HP-RT operating
system is the HP 9000 Model 742rt VMEbus board computer
shown in Fig. 1. This system consumes two
slots of a VMEbus backplane. The system processing unit
and onboard I/O features of the Model 742rt include:

• PA-RISC 7100 processor, which has a clock frequency of
50 MHz and is capable of executing 61 MIPS

• 8M bytes of ECC (error correction code) RAM for main
memory, which can be upgraded to 64M bytes of ECC RAM
(The ECC RAM comes in a pair of SIMMs and provides
single-bit error correction and multiple-bit error detection.)

• 64K-byte external instruction cache and 64K-byte external
data cache

• Onboard I/O ports for one SCSI II interface (up to seven
devices), two serial RS-232-C interfaces, one parallel
interface, and one Ethernet LAN interface

• VMEbus D64 interface, which provides an asynchronous,
32-bit data bus that is capable of transfer rates of up to
40 Mbytes/s.

The Real-Time Kernel 

The HP-RT kernel and I/O drivers are designed for real-time
response and determinism at a level never before accomplished
in a Hewlett-Packard operating system product. The
HP-RT kernel ensures that the highest-priority operations are
serviced within 50 to 110 microseconds in the worst case and
typically much faster, depending on the specific operation.
To accomplish this, the HP-RT kernel uses a fully reentrant
and interruptable design and makes extensive use of full
kernel support for threads for user and kernel processes.

Multithreaded Kernel 

The fundamental unit of an executing task in HP-RT is the 
concept and structure of a thread. A thread contains a pro- 
gram counter (next instruction pointer) and a stack for re- 
cording local subroutine variables and calling sequence pa- 
rameters. Threads do not own a specific address space or a 
specific set of code. Threads typically share address space 
(data area) and code with other threads. The concept of a 
process is simply a combination of a single thread, a code 
segment, and a data area (see Fig. 4a). HP-RT extends this 
concept by allowing a single process to create multiple 
threads (see Fig. 4b). These additional threads execute code 
in the same process code area and have identical access 
rights to all data areas in the process. See "An Overview of 
Threads." on page 27 for a brief tutorial on threads. 

HP-RT also implements the concept of a kernel thread. A 
kernel thread is a thread of execution that only executes 
kernel code at a kernel privilege level. Kernel threads are 
used in HP-RT to provide kernel services asynchronously 
for any specific user process or thread with each service 
executing at a user-specified priority. 

Reentrancy and Interruptability 

The HP-RT kernel's general model is to execute on behalf of
a thread of execution with interrupts enabled and context
switching allowed. The specific thread executing may be a
thread associated with a user process or a kernel thread. All 
threads, regardless of type, have their own user-specified 
priority, scheduling policy (time-sliced versus run-to- 
completion), and system level. 

The system level is a specification of the mode in which a 
thread is executing. At system level zero, a thread runs in 
user mode, with user-level privileges. Kernel threads by defi- 
nition never use this system level. At system level one, a 
thread executes kernel code with kernel-level privileges and 
with all interrupts enabled and context switching allowed.
At system level two, a thread executes kernel code with 
context switching disabled, but interrupts enabled. Finally, 
at system level three, a thread executes kernel-level code 
with both context switching and interrupts disabled. Table I 
summarizes these system levels and execution modes. 
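
The four system levels form an ordered scale, which makes them easy to express as a small enumeration. The Python sketch below is only a restatement of the paragraph above; the constant and function names are invented for the example and are not HP-RT identifiers.

```python
from enum import IntEnum


class SystemLevel(IntEnum):
    USER = 0          # user mode, user-level privileges
    KERNEL = 1        # kernel code, interrupts on, context switching allowed
    NO_SWITCH = 2     # kernel code, context switching disabled
    NO_INTERRUPT = 3  # kernel code, switching and interrupts both disabled


def kernel_mode(level):
    return level >= SystemLevel.KERNEL


def context_switch_allowed(level):
    return level <= SystemLevel.KERNEL


def interrupts_enabled(level):
    return level <= SystemLevel.NO_SWITCH
```

Each step up the scale removes one source of preemption, so the level number alone says what can interrupt the running thread.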

Context switching and interrupt handling in HP-RT are
described in more detail in the article on page 33.

Fig. 4. Thread configurations. (a) A typical single-thread process.

Table I
System Levels and Execution Modes

System Level   Execution Mode   Context Switching   Interrupts
Zero           User             Allowed             Enabled
One            Kernel           Allowed             Enabled
Two            Kernel           Disallowed          Enabled
Three          Kernel           Disallowed          Disabled


The HP-RT system supports one nonthread mode of execution,
which is based on execution using a single interrupt
stack. However, unlike timesharing systems and many real-time
systems, HP-RT makes very limited use of
interrupt-stack-based execution because this mode of execution is
always at a higher priority than thread execution. Execution
using an interrupt stack means that a full thread context is
not established, which means that a context switch to a
thread cannot be allowed until the interrupt-stack-based
execution is complete. Most interrupt service routines, such
as the handlers for the SCSI bus and LAN interrupts, are
instead handled by a specific kernel thread. These threads
are scheduled when their corresponding interrupt occurs at
their specific priority and are not executed until all
higher-priority thread execution is complete.

Because of the general reentrancy of HP-RT, explicit calls
are used in kernel code and I/O drivers for managing
reentrancy.† The macros sdisable(), srestore(), disable(), and restore()
are used to move a process to system levels two (context
switch disabled) or three (both context switching and interrupts
disabled) and back to the previous system level. Turning
context switching off guarantees atomicity with respect
to the execution of other threads. Turning off interrupts
guarantees atomicity with respect to execution of both threads
and interrupt-stack-based handlers.
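
The save-and-restore pattern behind these macros can be modeled in a few lines. The sketch below is a Python model, not the kernel macros themselves: the article names the macros but does not show their arguments, so the signatures, the Thread class, and the rule that sdisable() never lowers the level are all assumptions.

```python
class Thread:
    def __init__(self, system_level=1):
        self.system_level = system_level


def sdisable(thread):
    """Disable context switching; return the level to restore later."""
    previous = thread.system_level
    thread.system_level = max(previous, 2)  # assumed: never lowers the level
    return previous


def disable(thread):
    """Disable context switching and interrupts; return the prior level."""
    previous = thread.system_level
    thread.system_level = 3
    return previous


def srestore(thread, previous):
    """Return to the level saved by the matching sdisable() call."""
    thread.system_level = previous


def restore(thread, previous):
    """Return to the level saved by the matching disable() call."""
    thread.system_level = previous
```

Because each call hands back the level it found, critical sections can nest: an inner disable()/restore() pair inside an sdisable()/srestore() pair leaves the thread at level two, not level one, when the inner section ends.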

Data structures used by the kernel are generally global to the
entire kernel and nonreentrant operations must be properly
protected. A simple example of this is the use_count field of
the in-core inode†† data structure. The use_count field indicates
the number of instances of a particular file that are
active (e.g., open). When a new process accesses an inode, the
equivalent of the code statement inode_ptr->in_use++ (increment
use_count) must be executed. On PA-RISC (and most
RISC processors), this code translates to a sequence of instructions
that loads the use_count value, increments it, and
then stores the value to the memory location it came from.
Interleaving such operations, which can easily happen because
of a context switch from one thread to another, will
cause the use_count to miss an increment, producing devastating
long-term results.

† A reentrant process consists of logically separate code and data segments and a private stack.
Multiple instances of a reentrant process can share the same code segment, but each instance
has its own data segment and stack.

†† An inode is the internal representation of a file in a UNIX system-based operating system.
An in-core inode is one that resides in main memory.

For example, Fig. 5 shows what can happen when a thread
is interrupted before it finishes incrementing the use_count
field for a particular inode. The use_count field is represented
by the variable X, which is initially equal to one (i.e., some
other thread or process is accessing the same file).
Thread 1 begins executing the instructions to increment X,
but just before it stores the result in X, Thread 2 interrupts
and the scheduler hands control over to Thread 2. Thread
2 increments the same use_count field. When Thread 2 is
finished, X = 2 and the scheduler returns control to
Thread 1. Thread 1 finishes its work on the
use_count field by storing the value it computed before being
interrupted into X. At this point X should be equal to three,
but because Thread 1 was interrupted before it finished its
critical section, X = 2.
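
The lost increment can be reproduced deterministically by replaying the three machine steps (load, increment, store) in a chosen order. The following Python sketch is a simulation written for this purpose, not kernel code; the run() function and its schedule format are invented for the example.

```python
def run(schedule, x=1):
    """Replay load/increment/store steps for two threads sharing X.

    `schedule` lists which thread executes its next step. Each thread
    performs three steps in order: load X into a private register,
    increment the register, store the register back to X.
    """
    shared = {"X": x}
    register = {1: None, 2: None}
    step = {1: 0, 2: 0}
    for thread in schedule:
        if step[thread] == 0:
            register[thread] = shared["X"]   # load
        elif step[thread] == 1:
            register[thread] += 1            # increment
        else:
            shared["X"] = register[thread]   # store
        step[thread] += 1
    return shared["X"]


serial = run([1, 1, 1, 2, 2, 2])        # Thread 1 finishes before Thread 2 starts
interleaved = run([1, 1, 2, 2, 2, 1])   # Thread 1 preempted just before its store
```

Starting from X = 1, the serial order ends with X = 3, while the interleaved order ends with X = 2 because Thread 1 stores a value computed from the stale copy it loaded before being preempted.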

The need for atomic increment and decrement operations
is so pervasive in the HP-RT kernel that special macros
called ATOMIC_INC() and ATOMIC_DEC() are used. These macros
generate inline assembly code that disables interrupts, performs
the increment or decrement operation, and reenables
interrupts.

Use of an interrupt disable versus a context switch disable
is a key design decision for every critical section of HP-RT
kernel code. The main question asked in arriving at a decision
is whether the operation is critical relative to execution
of code that can run on the interrupt stack. Since very little
code in HP-RT executes on the interrupt stack, a context
switch disable usually suffices for protection. However, a
context switch disable is a more expensive operation than
an interrupt disable operation. A context switch disable requires
a memory access, whereas an interrupt disable requires only
execution of an inline assembly statement, which turns off the
interrupt enable bit in the PA-RISC processor status word.
Thus, very short operations are better protected with
interrupt disables.

This raises the question of how HP-RT solves the problem of
long critical sections for which a context switch or an interrupt
disable lasts too long. In the analysis of customer requirements
and competitive systems, it was determined that
context switch off times should be held to as close to 100
microseconds as possible, and ideally less, and interrupt
disables should be held as close to 50 microseconds as possible,
and ideally less. Longer critical sections are managed
using kernel-level semaphores.

[Fig. 5 diagram: parallel timelines for Thread 1 and Thread 2, each
executing the use_count increment. X = use_count field in the inode
data structure; the remaining labels are registers.]

Fig. 5. What can happen when a thread is context switched in the
middle of a critical operation. Thread 1 is interrupted and context
switched just before it is about to increment the use_count value. As a
result, when Thread 1 is finally able to finish its operation, the wrong
value is stored in use_count.

An Overview of Threads



When a process is running it executes a sequence of instructions stored in its
address space in memory. This execution of a sequence of instructions is called a
thread of execution, or simply a thread. The execution of a thread requires that it
have its own program counter to point to the next instruction in the sequence,
some registers to hold variables, and a stack to keep track of local variables and
procedure call information. Although threads have some of the same characteristics
as a regular process, they are sometimes called "lightweight" processes because
they don't carry around the overhead (or extra weight) of regular processes.
Table I lists some typical items associated with each thread and each process.

Fig. 1 models processes and threads running in a computer. The processes in Fig.
1a have one thread of execution each. They also have their own address spaces,
making them independent of each other. To communicate with each other (for
example, to share resources) they must do so through the system's interprocess
communication primitives, such as semaphores, monitors, or messages. In Fig. 1b
the three threads are in one process. Thus they share the same address space and
have access to all the per-process items listed in Table I.

One of the reasons threads were invented was to provide a degree of quasiparallel
execution to be combined with sequential execution and blocking system calls. For
example, consider a file server that must block occasionally to wait for the disk. In
a single-process situation the server would get a request and service it to completion
before moving on to the next request. Thus, no other requests would be serviced
while the server is waiting on the disk. If the machine is a dedicated file
server, the CPU is also idle while the server process is waiting on the disk.



Table I
Items Associated with Threads and Processes

Per-Thread Items*      Per-Process Items
Program counter        Address space
Stack                  Global variables
Registers              Files
                       Child processes
                       Signals
                       Semaphores

* All per-thread items are also per-process items.

If the server is a multithreaded process, one thread could be responsible for reading
and examining incoming requests and then passing the request to a thread
that will do the work. When a thread must block waiting on the disk, the scheduling
thread can get another request and invoke another thread to run. The result of
using threads in this case would be higher throughput because the CPU would not
sit idle, and better performance because it is much faster to switch threads than
to switch processes.
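
A minimal sketch of this dispatcher-and-workers arrangement is shown below, using Python's standard threading and queue modules. It is an illustration only: the serve() function, the request names, and the use of a short sleep to stand in for a blocking disk read are assumptions, and nothing here is specific to HP-RT.

```python
import queue
import threading
import time


def serve(requests, workers=3, disk_delay=0.01):
    """Service every request using a pool of worker threads."""
    pending = queue.Queue()
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            name = pending.get()
            if name is None:          # sentinel: no more work
                return
            time.sleep(disk_delay)    # stands in for blocking on the disk
            with lock:
                results[name] = "contents of " + name

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for name in requests:             # the dispatching thread hands out work
        pending.put(name)
    for _ in threads:                 # one sentinel per worker
        pending.put(None)
    for thread in threads:
        thread.join()
    return results


served = serve(["a.txt", "b.txt", "c.txt", "d.txt"])
```

While one worker is blocked in its simulated disk wait, the others can pick up further requests from the queue, which is the overlap the sidebar describes.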

In a real-time system where a quick response to interrupts and other events is 
critical, threads offer some definite advantages, especially if one considers 
context switching between processes versus switching between threads. Table II 
summarizes some of the main differences between threads and processes. 

Table II
Differences between Threads and Processes

Processes                                  Threads
Program-sized                              Function-sized
Context switch may be slower               Context switch may be faster
Difficult to share data                    Easy to share data
Owns resources such as files and memory    Owns stack space and registers only

Fig. 1. Models of processes and threads running in a computer. (a) Multiple processes.
(b) Multiple threads in one process.



Kernel Semaphores and Priority Inheritance 

An example of an extended critical section is the manipulation
of an in-core inode. Critical inode operations such as the
addition of a file to the directory data of a directory inode
must be performed atomically. Each inode holds a semaphore
which is locked and unlocked around these critical
operations.



The HP-RT kernel uses the simple semaphore primitives
swait() and ssignal() (corresponding to Dijkstra's P and V
operations) for process synchronization, mutual exclusion,
and atomic resource management. A single 32-bit integer is
used as a kernel semaphore data structure. This data structure
supports two semaphore types: counting semaphores
and priority-inheritance semaphores. With an additional
level of lock and unlock code and using a separate integer as
a counter, priority-inheritance semaphores can also be used
as the basis for counting semaphores. Priority-inheritance
semaphores are described later in this paper.

[Fig. 6 diagram: a locked semaphore heading a linked list of
waiting threads in priority order.]

Fig. 6. A locked counting semaphore and waiting threads.

The semaphore primitives ssignal() and swait() have the code to
interpret the contents of the kernel semaphore data structure
and are able to differentiate between counting and
priority-inheritance semaphores.

A counting semaphore in HP-RT holds a positive count value
when the semaphore is unlocked and a resource is available.
An swait() operation on a positive-valued semaphore causes
the semaphore to be atomically decremented, and the calling
thread continues execution. An swait() on a zero or
negative-valued semaphore (the resource is not available)
causes the thread to block (suspend) on the semaphore.

When one or more threads are blocked on a counting semaphore,
the threads are placed into a priority-ordered linked
list with the semaphore heading the list. To identify a
semaphore that is locked and has one or more waiting threads,
the semaphore is set to the negative address of the first
waiting thread (see Fig. 6). The sem and owner fields shown
in Fig. 6 are described below.

An ssignal() on an unlocked or locked-with-no-waiters counting
semaphore merely causes the nonnegative value of the
semaphore to be atomically incremented. An ssignal() on a
locked semaphore with one or more waiters (one that holds
a negative thread structure address) causes the first
(highest-priority) waiting process to be unlinked and scheduled.
Table II summarizes the different states of HP-RT counting
semaphores.



Table II
Different States of Counting Semaphores

State       Meaning

0           Locked with no waiters
-Address    Locked with waiters (The address points to the
            first thread in the list of waiting threads.)
≥ 1         Unlocked
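
The counting semaphore states in Table II can be illustrated with a small C sketch. This is not the HP-RT source: the names ksem_t, struct thread, csem_swait, and csem_ssignal are hypothetical, an intptr_t stands in for the 32-bit semaphore word, and the updates are not atomic as they would have to be in a kernel. The sketch also assumes that when the last waiter is unlinked the semaphore is left at zero (locked with no waiters), which the text implies but does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical thread record; only the fields this sketch needs. */
struct thread {
    int priority;          /* assumed: larger value = more important */
    struct thread *next;   /* next waiter in priority order */
};

typedef intptr_t ksem_t;   /* stands in for the 32-bit semaphore word */

enum csem_state { CSEM_UNLOCKED, CSEM_LOCKED_NO_WAITERS, CSEM_LOCKED_WAITERS };

/* Classify a counting semaphore value according to Table II. */
static enum csem_state csem_classify(ksem_t s)
{
    if (s > 0)
        return CSEM_UNLOCKED;
    return (s == 0) ? CSEM_LOCKED_NO_WAITERS : CSEM_LOCKED_WAITERS;
}

/* A negative value is the negated address of the first waiter. */
static struct thread *csem_first_waiter(ksem_t s)
{
    return (s < 0) ? (struct thread *)(-s) : NULL;
}

/* Returns 1 if the caller continues, 0 if it was queued (would block). */
static int csem_swait(ksem_t *s, struct thread *self)
{
    assert(s != NULL && self != NULL);
    if (*s > 0) {
        (*s)--;
        return 1;
    }
    /* Insert self into the priority-ordered waiter list. */
    struct thread *head = csem_first_waiter(*s);
    struct thread **link = &head;
    while (*link != NULL && (*link)->priority >= self->priority)
        link = &(*link)->next;
    self->next = *link;
    *link = self;
    *s = -(intptr_t)head;
    return 0;
}

/* Returns the thread to schedule, or NULL if the count was incremented. */
static struct thread *csem_ssignal(ksem_t *s)
{
    assert(s != NULL);
    if (*s >= 0) {
        (*s)++;
        return NULL;
    }
    struct thread *first = csem_first_waiter(*s);
    *s = (first->next != NULL) ? -(intptr_t)first->next : 0;
    first->next = NULL;
    return first;
}
```

In this sketch a caller would treat a zero return from csem_swait as the point at which the real kernel suspends the thread, and a non-NULL return from csem_ssignal as the thread to place on the ready list.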

One drawback of this semaphore methodology is that there 
is no clear ownership of a locked semaphore. The second 
drawback is the risk of priority inversion. 

Priority Inversion 

In most real-time operating systems, a priority-driven pre- 
emptive scheduling approach is used. This scheduling 
method works well when a higher-priority process (or
thread) can preempt a lower-priority process with no delays. 
One important problem that sometimes hampers the effec- 
tiveness of this scheduling algorithm is the problem of 
blocking caused by the synchronization of processes that 
share physical or logical resources. 

The most common situation occurs when two processes
attempt to access shared data. In a normal situation, if the
higher-priority process gains access to the resource first,
then good priority order is maintained. However, if a
higher-priority process tries to gain access to a shared resource
after a lower-priority process has already gained access to
the resource, then a priority inversion condition takes place
because the higher-priority process is required to wait for a
lower-priority process to complete.

The following example, which is loosely based on an example
first described by Lampson and Redell,1 shows how a priority
inversion can occur. Although the term process is used in
the following example, the executing entity could just as
well be a thread.

Let P1, P2, and P3 be three processes arranged in descending
order of priority. Let processes P1 and P3 share a common
data structure which is guarded by the binary semaphore X.
Fig. 7 and the following sequence show the events that can
lead to a priority inversion:

1. P3 locks X and enters its critical section.

2. P1 arrives, preempts P3, and begins its processing.

3. P1 tries to lock X, but because X belongs to P3, P1 is
blocked.

4. P3 again attempts to finish its critical section.

5. P2 arrives and preempts P3 before it finishes its critical
section.

6. Assuming there are no more preemptions, at some point
P2 finishes, then P3 finishes, and P1 finally is unblocked on
resource X and allowed to finish its critical section.

In this scenario the duration of P1's blocking is unpredictable
because other processes can show up before P3 finishes its
critical section and is able to release X.

Priority Inheritance

The methodology used in HP-RT to avoid the priority inversion
problem employs priority-inheritance semaphores. The
basic concept of priority-inheritance semaphores is that
when process P blocks a higher-priority process, it executes
its critical section at the highest priority level of all of the
blocked jobs. Process P returns to its original priority level
when it completes its critical section, which then allows the
highest-priority blocked process to execute.

From the example above, if P1 is blocked by P3 then, according
to the priority-inheritance concept, P3 inherits the same
priority as P1 while it executes in its critical section. When
process P2 arrives (while P3 is in its critical section) it
would not be able to preempt process P3 because P3 would
be running at a higher priority than P2. Thus, process P2 will
not begin execution. When P3 finishes its critical section,
process P1 can preempt P3 and run to completion. Then
process P2 can begin execution.

Priority-inheritance semaphores can become quite complex
when nested semaphore locks are allowed as they are in the
HP-RT kernel. Not only must the current owner and all waiters
for a semaphore be known, but given the owner of a
particular semaphore, the highest-priority waiters of all
semaphores currently owned by that owner must be known.
This allows the system to manipulate priority properly as
semaphores are released. The priority must revert to the
priority of the current highest-priority waiter of all still-owned
semaphores.
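
The priority rule for nested locks can be sketched in C as follows. The structures and function names here (pi_sem, pi_thread, effective_priority, and so on) are hypothetical and much simpler than a kernel would need; the sketch assumes that a larger number means a more important priority and that each owned semaphore already knows the priority of its highest-priority waiter. It only shows how the owner's running priority follows from its base priority and the waiters on the semaphores it still holds.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OWNED 8        /* arbitrary limit for this sketch */
#define NO_WAITER (-1)     /* marks a semaphore with no waiters */

struct pi_sem {
    int top_waiter_prio;   /* priority of highest-priority waiter, or NO_WAITER */
};

struct pi_thread {
    int base_prio;                          /* priority assigned by the user */
    const struct pi_sem *owned[MAX_OWNED];  /* semaphores currently held */
    size_t n_owned;
};

/* Record that thread t now owns semaphore s. Returns 0 on success. */
static int pi_acquire(struct pi_thread *t, const struct pi_sem *s)
{
    assert(t != NULL && s != NULL);
    if (t->n_owned == MAX_OWNED)
        return -1;
    t->owned[t->n_owned++] = s;
    return 0;
}

/* Forget semaphore s; the owner's priority is recomputed on demand. */
static void pi_release(struct pi_thread *t, const struct pi_sem *s)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->n_owned; i++) {
        if (t->owned[i] == s) {
            t->owned[i] = t->owned[--t->n_owned];
            return;
        }
    }
}

/* Base priority, raised to the highest waiter of any still-owned semaphore. */
static int effective_priority(const struct pi_thread *t)
{
    assert(t != NULL);
    int p = t->base_prio;
    for (size_t i = 0; i < t->n_owned; i++) {
        if (t->owned[i]->top_waiter_prio > p)
            p = t->owned[i]->top_waiter_prio;
    }
    return p;
}
```

With P3 at priority 1 holding X while P1 (priority 3) waits on it, effective_priority returns 3, so P2 (priority 2) cannot preempt P3. Once X is released, the value falls back to the highest waiter of whatever P3 still owns, or to its base priority.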

To manage this complexity and yet retain a single interface
and data structure for semaphore operations, HP-RT uses
the semaphore value -1 to indicate unlocked for a
priority-inheritance semaphore. A value of one is not a possible
thread structure address, so this value cannot be confused
with the negative address of the first waiter of a counting
semaphore.

Two fields in the thread structure are used to differentiate
between the various states of priority-inheritance and
counting semaphores when they are locked. A counting semaphore
that is locked and has waiters will have the sem field in the
first waiter thread holding the address of the semaphore
and an owner field containing zero (see Fig. 6). A
priority-inheritance semaphore that is locked and has no
waiters will hold the negative address of the owner thread,
which has a sem field with a value of zero (see Fig. 8a).
Lastly, a locked priority-inheritance semaphore that has
waiters will hold the negative address of the highest-priority
waiting thread. This thread structure has a sem field holding
the address of the semaphore and an owner field holding the
address of the owning thread (see Fig. 8b).

To keep track of the highest-priority waiters for all owned
priority-inheritance semaphores, a doubly linked list contain- 
ing the highest-priority waiters for each owned semaphore is 
attached to the thread structure of each semaphore owner. 

Fig. 8. Data structures associated with priority-inheritance
semaphores. (a) A locked semaphore with no waiting threads.
(b) A locked semaphore with waiting threads.

The different states of priority-inheritance semaphores are
summarized in Table III.



Table III
Different States of Priority-Inheritance Semaphores

State                                           Meaning

-1                                              Unlocked

-Address of thread owner                        Locked without waiters
(sem field in thread owner = 0)

-Address of highest-priority waiting thread     Locked with waiters
(sem field in highest-priority waiting thread
= semaphore address and owner field =
thread owner address)
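
Taken together, Tables II and III suggest how a single integer plus the sem and owner fields can be decoded. The C sketch below is an illustration only: struct kthread, enum ksem_state, and ksem_decode are hypothetical names, intptr_t stands in for the 32-bit word, and the decoding order is an assumption drawn from the description above rather than from the HP-RT source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical thread structure holding only the two fields the text names. */
struct kthread {
    intptr_t sem;     /* address of the semaphore being waited on, or 0 */
    intptr_t owner;   /* address of the owning thread (priority-inheritance), or 0 */
};

enum ksem_state {
    COUNT_UNLOCKED,
    COUNT_LOCKED_NO_WAITERS,
    COUNT_LOCKED_WAITERS,
    PI_UNLOCKED,
    PI_LOCKED_NO_WAITERS,
    PI_LOCKED_WAITERS
};

/* Decode a semaphore word using the rules of Tables II and III. */
static enum ksem_state ksem_decode(intptr_t value)
{
    assert(value != INTPTR_MIN);   /* the negation below must not overflow */
    if (value > 0)
        return COUNT_UNLOCKED;
    if (value == 0)
        return COUNT_LOCKED_NO_WAITERS;
    if (value == -1)
        return PI_UNLOCKED;        /* 1 is never a valid thread address */

    /* Any other negative value is a negated thread structure address. */
    const struct kthread *t = (const struct kthread *)(-value);
    if (t->sem == 0)
        return PI_LOCKED_NO_WAITERS;   /* t is the owner */
    if (t->owner == 0)
        return COUNT_LOCKED_WAITERS;   /* t is the first waiter */
    return PI_LOCKED_WAITERS;          /* t is the highest-priority waiter */
}
```

The point of the encoding is that the common cases (positive, zero, and -1) are decided without touching memory; the thread structure is examined only when the semaphore is locked and holds an address.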



Process Scheduling 

HP-RT currently uses M distinct priority levels with the ability
to extend support to 1024 distinct priority levels. Half of
all HP-RT priorities are reserved for use by kernel management
software. There is no explicit user program interface
provided for placing a priority at these reserved levels. The
reserved priorities are interleaved with the user priorities
and are considered a "priority boost" on a user priority.
Thus, between any two user priorities N and N + 1 lies a
priority N + boost, which is more important than priority N
and less important than priority N + 1. Boosted priorities are
used by kernel service threads to provide service just above
the priority of the highest-priority requesting process, but
not at the next highest user priority which may be in use by
the system user. Priority boosting is also used for temporary
elevation of the priority of processes blocking on I/O operations
to maximize throughput. This type of algorithm is only
used in a user-specified portion of the overall priority range.

The HP-RT kernel internally manages priorities by converting
from the user priority plus a possible boost value to a
run queue table index by using the formula:

Internal Priority = (user priority) x 2 + boost,

where boost is either zero or one. Hence, if user priorities
range from zero to 127, the internal priorities range from
zero to 255.
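
As a worked check of the formula, the short C function below computes the internal priority. The function name is hypothetical; only the arithmetic comes from the text.

```c
#include <assert.h>

/* Internal priority = (user priority) x 2 + boost, with boost 0 or 1. */
static int internal_priority(int user_priority, int boost)
{
    assert(user_priority >= 0);
    assert(boost == 0 || boost == 1);
    return user_priority * 2 + boost;
}
```

For example, user priority 5 maps to internal priority 10, the boosted level maps to 11, and user priority 6 maps to 12, so the boosted level sits strictly between two adjacent user priorities as described above.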

HP-RT maintains a run-queue table with one entry per internal
priority. Each entry holds a ready thread list head and a
list tail pointer (see Fig. 9). To determine quickly the highest
priority for which there is a runnable thread, HP-RT uses a
two-level bit mask called a ready mask in which a set bit
indicates a runnable thread. The top level of the ready mask
is one 32-bit word. Each bit in this word indicates that within
a set of 32 priorities, at least one thread is executable. Thus,
if as shown in Fig. 9 the high-order bit of the first word of
the ready mask is set, then there is at least one thread in the
internal priority range of 1023 to 992 that is executable. The
second level of the ready mask holds up to 32 32-bit entries
each of which indicates which of these 32 priorities holds
executable threads.

By using high-speed assembly language code to find the first
set bit in the ready mask, the highest-priority thread in the
nonempty run queue can be quickly determined.
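
The two-level ready mask can be sketched in portable C as shown below. The structure and function names are hypothetical, and a simple shift loop stands in for the assembly language find-first-set code mentioned above. The bit layout (bit g of the top word covers priorities 32g to 32g + 31) is an assumption chosen to match the statement that the high-order bit covers priorities 1023 to 992.

```c
#include <assert.h>
#include <stdint.h>

#define RM_GROUPS 32   /* 32 groups of 32 priorities = 1024 levels */

struct ready_mask {
    uint32_t top;               /* bit g set: group g has a runnable thread */
    uint32_t group[RM_GROUPS];  /* bit b of group g: priority 32*g + b is runnable */
};

/* Index of the highest set bit, or -1 if the word is zero. */
static int highest_set_bit(uint32_t w)
{
    int bit = -1;
    while (w != 0) {
        bit++;
        w >>= 1;
    }
    return bit;
}

/* Mark an internal priority as having at least one runnable thread. */
static void rm_set(struct ready_mask *m, unsigned prio)
{
    assert(prio < 32u * RM_GROUPS);
    m->group[prio >> 5] |= UINT32_C(1) << (prio & 31u);
    m->top |= UINT32_C(1) << (prio >> 5);
}

/* Mark an internal priority as having no runnable threads. */
static void rm_clear(struct ready_mask *m, unsigned prio)
{
    assert(prio < 32u * RM_GROUPS);
    m->group[prio >> 5] &= ~(UINT32_C(1) << (prio & 31u));
    if (m->group[prio >> 5] == 0)
        m->top &= ~(UINT32_C(1) << (prio >> 5));
}

/* Highest internal priority with a runnable thread, or -1 if none. */
static int rm_highest(const struct ready_mask *m)
{
    int g = highest_set_bit(m->top);
    if (g < 0)
        return -1;
    return g * 32 + highest_set_bit(m->group[g]);
}
```

Whatever the number of runnable threads, the lookup examines only two words: the top-level word and the one second-level word it selects.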
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Managing PA-RISC Machines for 
Real-Time Systems 



In the HP-RT operating system, the interrupt-handling architecture 
is especially constructed to manage the high-performance timing 
requirements of real-time systems. 

by George A. Anzinger 



The task of an operating system is to manage the computer
system's resources. This management should be done so as
to give the best possible performance to user tasks or jobs
presented to the system. How this performance is measured
and valued differs depending on the task or mission of the
system. The three major classes of tasks or missions presented
to an operating system are timeshare, batch, and real
time. The important aspects of performance of these three
classifications differ, and, because they differ, require the
operating system to use different algorithms to manage
system resources.

Timeshare 

Timeshare systems are usually designed to share system
resources with all contending processes. The major resource
to be shared is CPU time, which is usually sliced into small
units (called time slices) and allocated to all runnable
processes in a "fair" way. Various notions of fair exist and have
been used, but in general, runnable processes contend at the
same level or priority for CPU time. Some (or even most)
systems modify this notion of fair to give more time to a
process that blocks often and less to a process that is
compute bound. Some systems may also have preferred
priorities for processes that run on behalf of the system. Such
processes may be handling printers, communication lines,
or other things that are shared with several processes.

Batch 

Batch systems are usually designed to maximize the
throughput of the system. That is to say, they attempt to get the
most work done in a given period of time. Such systems will
not usually use a timeshare scheduling algorithm because it
introduces overhead that does not add to the desired result,
namely throughput. To help achieve maximum throughput,
one popular batch scheduling algorithm is to run the job that
has the least amount of time left to run. The point is that
batch systems typically do not need to make any attempt to
share CPU time.

Real Time 

Real-time systems, unlike timeshare or batch systems, are
usually designed to run the most important process that is
ready. Importance is assigned by the user or designer of the
system, and the operating system has little or nothing to say
about it. The system designer (i.e., the user who sets up the
system) decides the order of process importance and assigns
priorities for all processes on the system. The operating
system's job then is very simple: give the CPU to the
highest-priority process that is ready. The performance of a real-time
system is usually measured by how fast it can respond to
events that change the identity of the highest-priority ready
process. Such events are usually external and come to the
system in the form of interrupts, but can also be internal in
the form of processes that promote other processes to higher
priorities (or demote themselves to lower priorities). Another
major event that real-time systems must respond to is the
passage of time. The indication of the passage of time also
comes to the system in the form of an external interrupt.

From this discussion, it is apparent that one major measure
of a real-time system is how quickly it can respond to an
interrupt. A response consists of:

• Recognizing that the interrupt is pending

• Processing the interrupt (i.e., deciding what to do about it)

• Taking the indicated action.

Usually the indicated action will be to switch context to the
process that is to handle the interrupt. Context switching
encompasses the actions taken when control or execution
moves from one process to another as a result of an interrupt
or some other event (see "Context Switching in HP-RT"
on page 32 for more about context switching).

From a system's point of view the response (or response
time) is the time it takes the whole system† to do something
that changes the environment it is monitoring or controlling.
From an operating system vendor's point of view the response
stops when the user code gets control and the operating
system's responsiveness is no longer key to system
performance.

While the system is dealing with one interrupt and preparing
a response, it may need to contend with other interrupts that
are less urgent. The system must take the time to determine
this.

It is also possible that, at the time an interrupt arrives, the
system is in a state in which the interrupt system or context
switching is off. The system needs to go into these states to
protect shared data from corruption by contending processes
(see "Protecting Shared Data Structures," on page 33). Some
systems protect themselves and their shared data by turning
off context switching whenever they are in system code.

† This includes the operating system, the user application, and the external devices.

Context Switching in HP-RT

Context switching can be defined as moving abruptly from one area of code to
another as the direct result of some influence outside of the program or programs
being switched to or from. Usually the context switch is the direct result of an
interrupt or trap (a trap is an internal interrupt caused by some program activity
such as divide by zero or illegal memory access). A context switch can also occur
as a result of a program or thread blocking. In this case the operating system will
context switch to a program or thread that is not blocked. These two different
ways of generating a context switch have different overhead costs as will be
explained below. One of the measures of a real-time system is how fast it context
switches. When used in this way the reference is to how fast one user process
can be suspended and another user process restarted.

To context switch, the operating system must save the from process's state. The
state consists of all the machine registers that the program may depend on. After
saving the from process's state, the to process's state must be restored. As a result
of this save and restore, the from and to processes have their view of the
world preserved and restored, respectively, even if they are suspended for a very
long time.

For example, consider the case of a user program that has asked for some device
input. The program will be suspended or blocked on the device driver waiting for
the device to respond with the desired data. While waiting, the operating system
will find some other program that is ready to run and switch to it. When the desired
data arrives, the processor will be interrupted and the operating system will
switch control of the processor to the waiting program.

As an example of a context switch that is strictly the result of an external
interrupt, consider the case in which a time slice is exhausted. In this case, both the
program being switched from and the one being switched to are interrupted as
opposed to having to block and wait for a resource.

From a system overhead point of view there are four different types of context
switch:

• Both the from and the to processes enter the blocked state programmatically.
• The from process blocks programmatically and the to process is interrupted.
• The from process is interrupted and the to process is blocked programmatically.
• Both processes are interrupted.

Because of calling sequence conventions, processes that are interrupted incur
additional overhead to save and restore caller registers.

To take advantage of the savings possible when processes block programmatically,
HP-RT uses a context switch routine based on this type of block. The extra work
required when processes are interrupted is performed by code in the system
interrupt handler.



This is not reasonable for a high-performance real-time
system that is trying to switch contexts in less than 50 µs.
For these systems it is necessary to recognize and process
interrupts in the 25-µs range. This implies that the interrupt
off time plus the interrupt processing time must be kept
below 25 µs.

This paper will explore the problems a PA-RISC architecture
presents to real-time processing. These problems revolve
around the need for fast context switching, interrupt handling,
and repeatability. Next, possible solutions to these
problems will be discussed, detailing the solutions used in
the HP-RT (real-time) operating system, which runs on the
HP 9000 Model 742rt VMEbus board computer. The hardware
and software components of the Model 742rt are described
in the article on page 23.



PA-RISC Architecture

The RISC architecture is used to speed up CPUs by designing
them so that each instruction is simple and can be
executed quickly. The goal is usually to have each instruction
take the same amount of time to execute and to design
the machine so that several instructions can be pipelined. To
get all instructions to execute in the same time requires that
no one instruction can be complex. Operations that are complex
and require more than one instruction time are either
handled by subroutines or by coprocessors. Coprocessors
are designed to run independently, allowing the main processor
to do other useful work while the coprocessor does its
work. For example, HP's PA-RISC machines use coprocessors
to do floating-point math.

In HP's PA-RISC processors, the following characteristics
are important for real-time applications:

• Memory reference instructions either load or store and do
nothing else. This means that there is no read-modify-write
instruction.

• Memory reference instructions may stall if the data is not
available. To help in this regard, a cache memory is used to
speed up the average access to memory.

• Since memory accesses are potential roadblocks, 32
general-purpose registers are available as well as 27 control registers
and 32 64-bit-wide floating-point registers. This allows the
processor to keep most of the variables of interest in
registers, avoiding slow memory access operations.

• All interrupt context is kept in control registers.

Real Time and HP's PA-RISC 

From a real-time perspective, the characteristics of HP's
PA-RISC that are of concern are those that limit performance
in the real-time sense. As discussed above, a real-time system
must be able to change its mind (context switch) quickly.
This implies that the large context associated with a process
can be a problem. Also, while changing context, as well as
doing other things, the system needs to be even more responsive
to interrupts. This means we must not turn the
interrupt system off for long times. In particular, we must
not turn it off for the duration of a context switch.

HP-RT is the result of porting a third-party operating system†
to the HP 9000 Model 742rt board-level real-time computer.
As such, the porting team was constrained to work with the
conventions existing in the system being ported. Likewise,
the porting team was not empowered to change any of the
language or hardware conventions that exist in HP's PA-RISC
machines and the HP-UX* host operating system.

To take advantage of the best of HP's PA-RISC processors,
the port team decided to restrict the system to PA-RISC 1.1
architectures. The 1.1 architecture provides shadow registers
that allow system interrupt code to be run without saving
any context (see "The Shadow Register Environment," on
page 34).

On examining the way the system we were porting recommends
that drivers be written we found the following:

• After an interrupt, the system enters the interrupt service
routine. The routine should be written in C and should make
a call to the operating system function ssignal and then return.

• The function ssignal increments a counting semaphore, and
if the result is 0, the interrupt service thread is put in the
ready list (execution threads and counting semaphores used
in the HP-RT operating system are described in the article
on page 23).

• If the new entry in the ready list has a higher priority than
the current process, a flag is set indicating that a context
switch is needed. (Context cannot be switched while in an
interrupt handler.)

• When the driver's interrupt service routine returns, the
system notices whether a context switch is pending and if
so takes the required action. If not, the system just returns
to the point of the interrupt.

† LynxOS from Lynx Real-Time Systems Inc.

The problem with this picture is that to call the interrupt 
service routine the system has to save most of the system 
state. This is a lot of overhead for only one function call 
and return. 

The team decided that a better way to handle interrupt 
servicing would be to code a companion ssignal function. The 
new ssignal runs using only the shadow registers and still 
does everything the original ssignal did. This scheme allows 
the whole ssignal call to be made without establishing a C 
context, which involves saving and restoring the C environment
(see "C Environment," on page 35). However, some
restrictions are placed on I/O drivers in that they have to 
make their semaphores known to the operating system. 

In some cases, calling the ssignal function is almost all that
an interrupt service routine will do. It is also possible that a
few lines of assembly code might be required to complete
the interrupt service routine. Such code might move a byte
of incoming data from the I/O device to an internal buffer.
For applications that have these kinds of interrupts, the system
provides the ability to call an assembly language interrupt
service routine. To keep overhead low, the assembly
language interrupt routine is restricted to using the shadow
registers and no system resources. The system interrupt
dispatcher calls the ssignal function if the assembly language
routine returns a nonzero semaphore.

Some I/O devices and drivers require full C-code interrupt
handlers. For these interrupts, the system establishes a C
context on an interrupt control stack. In this context
interrupts of higher priority are turned on while the interrupt is
processed. These routines can also call a limited number of
system functions. For example, the system time base
interrupt is handled by a C interrupt handler.

With three different possible interrupt handling situations,
the operating system needs to have the ability to decide
quickly which interrupt service routine to use. Usually this
is done by either a table index, in which the system determines
the method to use via a number that is an index into a
table of routines to call, or a case statement, in which the
indicated method, again expressed as a number, is used to
indicate which code to execute. A much quicker method
than these two is to put the address of the interrupt service
routine in the driver's table structure. This also allows the
system to be expanded easily to handle other interrupt
handler environments.
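
The dispatch method just described, in which the driver's table entry holds the address of the routine to call, might be sketched in C as follows. Everything here is hypothetical: the struct driver_slot layout, the handler signature, and the three stub entry routines (which merely count calls) are illustrative stand-ins for the shadow-register ssignal path, the assembly language routine path, and the full C handler path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler signature: takes a driver argument, returns a code. */
typedef int (*isr_entry)(void *arg);

struct driver_slot {
    isr_entry entry;   /* address of the routine that handles this interrupt */
    void *arg;         /* driver-specific data passed to the routine */
};

/* Stubs for the three interrupt handling situations; each counts its calls. */
static int enter_ssignal_only(void *arg) { ++*(int *)arg; return 1; }
static int enter_asm_isr(void *arg)      { ++*(int *)arg; return 2; }
static int enter_c_isr(void *arg)        { ++*(int *)arg; return 3; }

/* Dispatch by calling through the stored address: no switch, no method number. */
static int dispatch_interrupt(const struct driver_slot *table, size_t n, size_t vector)
{
    assert(table != NULL);
    if (vector >= n || table[vector].entry == NULL)
        return -1;   /* no driver registered for this vector */
    return table[vector].entry(table[vector].arg);
}
```

Adding a fourth handler environment in this scheme means writing one more entry routine and storing its address; the dispatcher itself does not change.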



Protecting Shared Data Structures 

Shared data structures are needed in any operating system to keep track of the
resources that the system is sharing among several processes. For example, each
process will need memory for its code and data. This memory is a shared resource
and the management structures must be accessed in a way that will not allow the
system to lose parts of the resource. One method of keeping track of a resource
like memory is to keep free pages of memory in a free list. When a page of memory
is needed, the page at the head of the free list is removed from the list and
given to the requesting process. This removal (and its subsequent return) must be
done in an atomic operation with respect to the contending processes. By this we
mean that, as far as any process cares, the removal of a page from the free list
happens as one indivisible operation. Otherwise, a contending process could get
control and possibly get the same page.

The importance of maintaining atomicity in dealing with a shared resource such
as memory on a free list is illustrated in the following example. The process of
removing page A from the free list involves:

1. Picking up the pointer to page A from the list head

2. Using the resulting pointer to get the pointer to page B, which is in the first
word of page A

3. Storing the pointer to page B in the list head.

If the removal is interrupted after step 1 but before step 3, and the interrupting
process also tries to remove a page from the free list, both processes will get the
same page and most likely the system will fail. Similar problems on returning of
pages to the free list can result in lost pages or even circular free lists.

The solution to these problems is to make a sensitive operation atomic with respect
to contenders. If only processes can contend, it is sufficient to prevent context
switches for these periods of time. If one or more of the contenders runs on an
interrupt, then interrupts must be disabled to achieve the required atomic operation.

The HP-RT system supports three levels of contention protection: 

• Interrupts disabled 

• Context switch disabled 

• Semaphore locking. 

From an overhead point of view, the cost is lowest for the interrupt disable and
highest for the semaphore lock. From an impact on performance point of view,
interrupts should be disabled only for short periods of time, context switch
disabled only for slightly longer times, and semaphores held as long as needed.

For short operations, such as the list removal operation described above, the
interrupt disable method is the best to use (even if the atomic test does not require
this level of protection) because the disable time is short and the overhead of
interrupt disable protection is the lowest of the three methods.
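
The three-step page removal and its protection by the interrupt disable method can be sketched in C as shown below. The functions irq_disable and irq_enable are hypothetical stubs that only count nesting; on real hardware they would mask and unmask processor interrupts. The names struct page, page_alloc, and page_free are likewise assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* The first word of a free page holds the pointer to the next free page. */
struct page {
    struct page *next;
};

static struct page *free_list_head;   /* head of the free list */
static int irq_depth;                 /* stub state: > 0 means "interrupts off" */

/* Hypothetical stand-ins for masking and unmasking processor interrupts. */
static void irq_disable(void) { irq_depth++; }
static void irq_enable(void)  { irq_depth--; }

/* Remove the page at the head of the free list as one indivisible operation. */
static struct page *page_alloc(void)
{
    irq_disable();
    struct page *a = free_list_head;   /* step 1: pick up pointer to page A */
    if (a != NULL) {
        struct page *b = a->next;      /* step 2: get pointer to page B from A */
        free_list_head = b;            /* step 3: store pointer to B in the head */
    }
    irq_enable();
    return a;
}

/* Return a page to the head of the free list under the same protection. */
static void page_free(struct page *p)
{
    assert(p != NULL);
    irq_disable();
    p->next = free_list_head;
    free_list_head = p;
    irq_enable();
}
```

Because nothing can interrupt the code between irq_disable and irq_enable, a second contender can never observe the list between steps 1 and 3, which is the window in which two processes could otherwise receive the same page.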



A New Interrupt Environment 

The need to deal with the three interrupt handling situations
described above and the requirement to handle interrupts
from the VMEbus meant that we had to design and implement
a new interrupt handling environment. Fig. 1 shows a
simplified view of the logical I/O architecture that the HP-RT
interrupt handling subsystem is designed to service.

The nature of the VMEbus requires a second level of interrupt
dispatch. This is necessary because VMEbus interrupts
come into the PA-RISC processor via one of seven lines or
PA-RISC interrupt levels. As shown in Fig. 1, each of these
lines can handle several independent devices, which implies
several interrupts.

The VMEbus standard specifies that a device requesting an
interrupt must assert its request on the interrupt line it is
using. The interrupt responder sees the request and sends
back an interrupt acknowledgment for that interrupt line.
Each device using the same line blocks the acknowledgment
signal from being seen by devices farther away from slot 0†
while it has an interrupt request pending. When a device
with an interrupt pending sees an interrupt acknowledge it
responds by sending back an interrupt vector. The interrupt
vector is a data element (byte or word) that identifies the
interrupting device and is used by the interrupt responder to
dispatch the interrupt.

Fig. 1. A logical view of the I/O architecture the HP-RT operating
system is designed to work with. (Figure labels: VMEbus; up to
32 I/O groups; up to 7 I/O device groups; VMEbus devices.)

The original plan for I he Model 7 I2n hardw are was to inter- 
rupi the PA-RISC processor when a VMEbus interrupt re- 
quest was asserted and to do the interrupt acknowledgment 
when the processor at templed to read the interrupi vector. 
This plan required the operating system to stall in the inter- 
rupt handler with the interrupt system off for an unspecified 
length of time because VMEbus devices are not required to 
yield the bus to a requester, making the actual lime required 
to do an operation on the bus open-ended. To solve this 
problem, the HP-RT team decided thai the interrupt vector 
should he prefetched by the hardware before interrupting 
the PA-RISC processor. In this way a VMEbus interrupt can 

I Slut D in a VMEbus caidcaye typically houses the card or cards that contain the VMEbus 
system controller and other resources 



The Shadow Register Environment 

The PA-RISC I I implementation added shadow registers lo the basin machine 
architecture. Shadow registers are seven registers into which ihe contents of GRs 
(general registersl 1 . 8, 9. 16, 1 7. 24. and 25 are copied upon interruption The 
contents ol these general registers are restored from their shadow registers when 
an RFIR (return from interruption and restore) instruction is executed 

The shadow register environment includes code that executes between a proces- 
sor interrupt and the following RFIR instruction This code is executed in HP-RT 
using only the shadow registers. It is important to note that the nature of this 
environment is further defined by the nature of the processor's behavior on inter- 
rupi When an interrupt occurs the processor transfers control to the interrupt 
code with the lolluwing state: 

• Inlerrupl system olf 

• Inlerrupl state collection disabled 

• Virtual memory system (both code and datal off 

• All access protection off. 

Since the virtual memory system is off, all memory for both code and data must
reside in and be accessed by physical addresses. Usually an operating system will
put the interrupt handling code in an area of memory that is "equivalently mapped."
This means that the physical and virtual addresses are the same. This also means
that code running in the shadow register environment cannot access memory with
virtual addresses that are not equivalent, since to do so would require the hard-
ware to map the address using its TLB (translation lookaside buffer).† The hazard
here is that the required entry may not be in the TLB, which would cause a trap to
the TLB miss handler. Since traps are a form of interrupt, the miss handler would
not be provided with the interrupt state (because the interrupt state collection is
disabled) and thus would not know how to return to the point of the trap.

On the plus side, if the whole interrupt can be processed in the shadow register
environment, the RFIR instruction is all that is needed to return to the point of
interruption.

† The translation lookaside buffer or TLB is a hardware address translation table. The TLB
speeds up virtual-to-real address translations by acting as a cache for recent translations.



be dispatched without the PA-RISC processor having to wait
for the VMEbus processor to fetch the interrupt vector. The
current hardware always does the interrupt acknowledge as
soon as possible but has the option of asserting the proces-
sor interrupt either immediately or on completion of the
interrupt acknowledgment.

Fig. 2 shows the steps involved in handling a VMEbus inter-
rupt, and Fig. 3 shows a portion of the system interrupt table,
which is used for handling second-level VMEbus interrupts



Interrupting VMEbus I/O Card

1. Send interrupt to VMEbus processor.

3. Acknowledge the IAK and send an
interrupt vector to the VMEbus processor.

VMEbus Processor

2. Send IAK (interrupt acknowledge)
message to the interrupting device.

4. Store interrupt vector at the arbiter
address.

HP-RT Operating System Running on a PA-RISC Processor

5. Interrupt HP-RT.

6. Decode interrupt to determine which
one of 32 interrupt lines caused the
interrupt.

7. Use the result from step 6 to index
into the HP-RT interrupt table (a in
Fig. 3).

8. Since this interrupt is associated
with a VMEbus device, the second-level
interrupt table is accessed (b in Fig. 3).

9. The second-level code (c in Fig. 3)
is responsible for interpreting the entries
in the second-level interrupt table.

10. The code mentioned in step 9 performs
the following:

• Retrieves the interrupt vector that
had been placed at the arbiter address
in step 4 (d in Fig. 3).

• Creates an index to the interrupt action
pointer by ANDing the value in
the mask entry (e in Fig. 3) with the
interrupt vector.

• Uses the index to find the handler
that will process the interrupt from
the interrupting device (f in Fig. 3).

• Transfers control to the handler.



Fig. 2. An example of the VMEbus interrupt handling process.

[Fig. 3 diagram. Labels recoverable from the figure: the HP-RT interrupt
table is indexed by the bit position in the interrupt word, and its group
of entries is repeated 32 times (one group for each bit in the PA-RISC
interrupt word). The single-level interrupt table (for all I/O except
VMEbus interrupts) shows an interrupt action, a semaphore address, and
driver-defined data; the types of interrupt handlers called are labeled
"Assembly" and "Driver." The second-level interrupt table for VMEbus
interrupts (there can be up to seven of these structures) shows the
second-level code and the arbiter address, with interrupt action pointer,
interrupt bit, and driver address entries repeated for mask + 1 entries.
At this point the entries for the first and second levels are the same.]

Fig. 3. The HP-RT interrupt table structure.

and non-VMEbus interrupts. Note the correspondence be-
tween the interrupt table structure and logical I/O architec-
ture shown in Fig. 1. The three different interrupt handling
situations mentioned above are taken care of by allowing
one of the three types of interrupt routines to be specified in
the table (see the interrupt action entry in Fig. 3).



C Environment 

C environment refers to the implied machine state when executing in a C language
program. This machine state is really a set of register use conventions that are
defined in the software architecture for the PA-RISC processors (see Fig. 1). Some
of the basic assumptions made in C about these registers include:

• Register 30 is the stack pointer and points at the first available double word on
the stack. The stack grows with increasing addresses.

• Just below the current stack pointer is a standard stack frame with room for the
return address to be saved (if the callee needs to save it) and room for each of the
call parameters to be saved.

• Registers 26, 25, 24, and 23 (as needed) contain the call arguments. If more than
four arguments are passed, those above the first four arguments are stored in the
stack frame.

• Register 27 is the global data register and is used to address any global data
needed by the procedure.

• Register 2 contains the address to return to when the procedure is done.

• Registers 28 and, if needed, 29 contain the function result when the function
returns.

• Registers 3 through 18 (the callee-saves registers) can be used only if they are
saved and restored before returning to the caller.

• Registers 19 through 22 (the caller-saves registers) and registers 1 and 31 are
available to use as scratch registers.

There are other conventions for floating-point and space registers, which are
usually not important in operating system code.

The shadow register environment, which consists of registers 1, 8, 9, 16, 17, 24,
and 25, is not compatible with the C environment.



GR0          Zero (by Hardware Convention)
GR1
GR2          RP (Return Pointer)
GR3-GR18     Callee-Saves Registers
GR19-GR22    Caller-Saves Registers
GR23-GR26    Arguments
GR27         DP (Global Data Pointer)
GR28-GR29
GR30         SP (Stack Pointer)
GR31         MRP (Millicode Return Pointer)

Fig. 1. Register use conventions in the C environment.



Second-level VMEbus interrupts are handled by reading the
returned interrupt vector, masking it, and using the result to
index to the interrupt action that will handle the interrupt
(f in Fig. 3). The masking is done to prevent indexing to a
location outside of the table and to allow the interrupting
device to pass back status information in the high part of the
word. The mask is computed at system configuration time
from the user's specification of the highest number to be re-
turned on a given interrupt line. This number is rounded up
to the nearest power of two (2^n). For example, if the highest
number to be returned on a particular interrupt line is 12,
then n is four because 2^4 provides the nearest power of two
greater than 12.† This results in a table that is larger than
needed but eliminates the need to check if the masked num-
ber is too large. Unused entries in both the first-level and
second-level interrupt tables are filled with entries that
result in system illegal interrupt messages should such an
interrupt ever happen.

† The largest value for n is 256.

© Copr. 1949-1998 Hewlett-Packard Co.

August 1993 Hewlett-Packard Journal 35
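
The mask computation and masked table lookup described above can be sketched in a few lines of Python. This is an illustration only, not HP-RT code: the function names, the handler names, and the `ILLEGAL` placeholder are assumptions made for the example, and the table size is assumed to be the smallest power of two greater than the highest vector number, which agrees with the worked case of 12 giving n = 4.

```python
ILLEGAL = "illegal_interrupt"  # hypothetical stand-in for the illegal-interrupt entry


def table_size(highest_vector):
    """Smallest power of two strictly greater than highest_vector."""
    size = 1
    while size <= highest_vector:
        size *= 2
    return size


def build_second_level_table(handlers, highest_vector):
    """Return (mask, table); unused slots hold the illegal-interrupt entry."""
    size = table_size(highest_vector)
    table = [ILLEGAL] * size
    for vector, handler in handlers.items():
        table[vector] = handler
    return size - 1, table


def dispatch(vector_word, mask, table):
    """Mask off the high (status) bits and index the table; no range check needed."""
    return table[vector_word & mask]


# Hypothetical configuration: highest vector number on this line is 12.
mask, table = build_second_level_table({3: "disk_handler", 12: "serial_handler"}, 12)
```

Because the table always has mask + 1 entries, any masked value is a valid index, so a device can place status bits above the mask without risk of indexing outside the table.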

Initially, the HP-RT team wanted the interrupt handler and the
interrupt off times to be "blind" to interrupts for a maximum
of 100 instruction times, including any stall states minus
cache misses. The notion of blind to interrupts was intro-
duced to cover the case in which the system keeps the inter-
rupt system off, but still processes the interrupt in a timely
fashion. This occurs in the interrupt handler, for example,
when after it processes an interrupt it looks at the pending
interrupts and, if it finds one, processes it without turning
on the interrupt system. The operating system interrupt
dispatching code met the 100-instruction time limit.

Handling Large Contexts 

The PA-RISC architecture divides a program's context into
two register sets: caller-saves and callee-saves registers. The
caller-saves registers consist of registers that are expected
to contain values that do not need to be preserved across a
procedure call, that is, values the calling function does not
care about. Therefore, these registers are available for use
as scratch registers or for parameter passing by the called
routine. The callee-saves registers are used for values that
must be preserved across a procedure call. Thus, if the
called routine wants to use a callee-saves register, it must
first save it and then restore it before it returns. The PA-RISC
architecture also specifies where these registers must be
saved on the call stack (see Fig. 4). This caller-saves and
callee-saves convention is used by the PA-RISC compilers so
that the system can depend on it.

HP-RT depends on the caller-saves and callee-saves division
to keep context management code to a minimum. In particu-
lar, on system calls the system saves only the user's (caller's)
return address, global register, and stack pointer. The system
call handler then calls the requested system call function,



Fig. 4. The relationships between
function (or procedure) calls, the
caller- and callee-saves registers,
and the stack area. The caller
puts data it wants to preserve in
the callee-saves registers before
making a call. If the called routine
(callee) needs to use any of the
callee-saves registers, it saves the
value contained in the register
and restores the value back into
the register before returning to
the caller.

depending on that function to save and restore any callee-
saves registers it may want to use. Likewise, on interrupts or
traps where control must be transferred to the kernel stack,
only the caller-saves registers need to be saved because
HP-RT depends on callee-saves registers to be saved by any
function called. Therefore, since the context switch code is
called as a function, all it has to save are the callee-saves
registers. By saving only what needs to be saved at each
step, the system keeps the overhead low for register saves
and restores.

HP-RT also takes advantage of the fact that the floating-
point coprocessor is enabled by setting bits in a control
register. If the coprocessor is not enabled, the system will
generate an emulation trap when a process attempts to use
any floating-point instructions. Processes start with the
floating-point coprocessor disabled. When a process at-
tempts to use floating-point instructions, the code in the
emulation trap handler saves the old process's floating-point
registers and loads the current process's floating-point regis-
ters. In this way, the overhead of floating-point context
switching is limited to only the times when it is needed.

In deference to maintaining a low interrupt-off time, the
system checks for pending interrupts once it has stored the
old process's floating-point registers. If any external inter-
rupts are pending at this time, the system will set the floating-
point ownership flags to show that the coprocessor is not
owned and then handle the interrupt. The current process
will be redispatched still not owning the floating-point co-
processor, but will immediately end up in the emulation trap,
which will finish the context switch. Of course the interrupt
could cause the current process to lose the CPU, possibly
even to the process whose state the system just saved. For
this reason, a flag is kept to show that the registers were not
changed so the process may proceed with only a quick pass
through the emulation code to get the coprocessor bits set
again.
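
The lazy floating-point switching described above can be modeled with a small Python class. This is a toy sketch under stated assumptions, not the HP-RT implementation: the class, its attribute names, and the single `registers` value standing in for the whole floating-point register set are all hypothetical, and the pending-interrupt check in the middle of the switch is left out.

```python
class LazyFpu:
    """Toy model of floating-point context switching done only on demand."""

    def __init__(self):
        self.current = None    # process currently running
        self.owner = None      # process whose values are in the FP registers
        self.enabled = False   # models the coprocessor-enable control bits
        self.registers = None  # stands in for the hardware FP register set
        self.saved = {}        # per-process saved FP state
        self.swaps = 0         # number of full save/load operations performed

    def dispatch(self, process):
        """Ordinary context switch: the FP registers are not touched."""
        self.current = process
        self.enabled = False   # the process runs with the coprocessor disabled

    def use_floating_point(self, value=None):
        """Models the current process executing a floating-point instruction."""
        if not self.enabled:
            self._emulation_trap()
        if value is not None:
            self.registers = value
        return self.registers

    def _emulation_trap(self):
        if self.owner != self.current:
            if self.owner is not None:
                self.saved[self.owner] = self.registers    # save old owner's state
            self.registers = self.saved.get(self.current)  # load current state
            self.owner = self.current
            self.swaps += 1
        self.enabled = True    # quick path: registers unchanged, just re-enable


fpu = LazyFpu()
fpu.dispatch("A")
fpu.use_floating_point(1.5)         # first use by A: trap, A becomes owner
fpu.dispatch("B")                   # B never uses floating point
fpu.dispatch("A")
kept = fpu.use_floating_point()     # A still owns the registers: quick path
swaps_after_quick_path = fpu.swaps
fpu.dispatch("B")
fpu.use_floating_point(2.5)         # B's first use forces a real save and load
fpu.dispatch("A")
restored = fpu.use_floating_point() # A's state is loaded back
```

In this model a process that never touches floating point costs nothing extra on a context switch, and a process that regains the CPU while still owning the registers takes only the quick path.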

Setjmp and Longjmp Solutions 

On rare occasions the operating system is required to abort
a system call. This occurs when the user sets up a signal
handler and the signal handler is specified as requiring the
termination of any system call that is pending when the sig-
nal is delivered. As mentioned above, the system takes ad-
vantage of the fact that functions called on a system call will
restore the callee-saves registers. These registers are saved
on the stack by each function in the call chain, starting from
the system call handler to the code that delivers the signal to
the user. The problem then is how to recover these registers
so the user code will have the correct register set when con-
trol is returned to it. The normal way to handle this kind of
situation is to do a setjmp call to save the callee-saves regis-
ters in a local buffer and then do a longjmp call (which re-
stores the saved registers) from the signal delivering code.
The porting team decided that the overhead of a setjmp on
every system call was too high.

One solution that was considered was to identify all possible
places in the kernel where such a signal could be delivered.
Code could then be put in place to do a setjmp only when the
signal delivery was possible. This approach was abandoned
when it was found that these calls could come from user-
written drivers. The solution used is to unwind the stack,
picking up each of the saved registers until the stack is back
to the system call handler. This solution takes more time in
the rare case of a call being aborted, but does not put over-
head in the path of all system calls.

Hardware Help 

It was mentioned above that the VMEbus hardware holds off
interrupts until the information needed to process the inter-
rupt is available. The HP-RT team also requested and re-
ceived a real-time mode in the interrupt convention for on-
board I/O device interrupts. The normal convention was that
all onboard device interrupts were collected into one bit
(each bit corresponds to one interrupt line). Under this con-
vention the software interrupt handler would first decode
the interrupt source to this bit and then read an I/O space
register that contained a bit map of all the onboard devices
requesting interrupt service. The hardware convention used
was to clear this register when it was read. This required the
software to keep track of all the bits that were set and to
call the handler for each bit. The software management task
for this convention would have been fairly high because the
real-time system wants the interrupt system on most of the
time, which means that it is possible for another interrupt to
be received from another onboard device before the current
interrupt is completely processed. At the same time, the rest
of the main processor's interrupt register would not be in use.

The HP-RT team asked for an interrupt mode in which each
onboard device has its own interrupt bit on which it can
interrupt the main processor. This convention not only elim-
inates the need to remember which bits were set, but also
eliminates a level of decoding in the interrupt path.

Conclusion 

One of the main goals of the HP-RT project was to minimize
the time to handle interrupts. Table I, which shows the re-
sults of these efforts, is a task response time line that shows
the time consumed by each activity in the path from an in-
terrupt to the task (e.g., user code) that does something to
respond to the interrupt. For cases in which an interrupt is
handled by an interrupt service routine in the operating sys-
tem and not user code, the interrupts disabled and dispatch
interrupts times shown in Table I are the only times involved
in determining the total task response time. Their worst-
case times in this situation are 80 µs and 6 µs respectively,
giving a total task response time of 86 µs. The 80 µs time is
rare and work is continuing to reduce this time.



Table I
Time Line for HP-RT Running on the HP 9000 Model 742rt

Tasks Performed                   Task Response
After an External Event       Best Case     Worst Case

Interrupts disabled           0             0
Dispatch interrupts           3 µs          6 µs
Other interrupts              0             9 µs†
Context switch off            0             166 µs††
Scheduling and switching      27 µs         45 µs
Return from system call       1.2 µs        4.6 µs
Total time                    31.2 µs       230.6 µs

† Three interrupts.

†† This time is rare and is in code other than the interrupt and context switch code. Work is
continuing to reduce this time.



HP-UX is based on and is compatible with UNIX System Laboratories' UNIX* operating system.
It also complies with X/Open's* XPG3, POSIX 1003.1 and SVID2 interface specifications.
UNIX is a registered trademark of UNIX System Laboratories Inc. in the U.S.A. and other countries.
X/Open is a trademark of X/Open Company Limited in the UK and other countries.



The HP Tsutsuji Logic Synthesis
System



A new logic synthesis system has reduced the time to design ASICs by a 
factor of ten. 



by W. Bruce Culbertson, Toshiki Osame, Yoshisuke Otsuru, J. Barry Shackleford, and Motoo Tanaka 



Logic synthesis assists and automates the process of refining
digital designs from high-level, abstract conceptions to low-
level, concrete specifications for physical implementation.
The HP Tsutsuji logic synthesis system, a software package
that runs on HP 9000 Series 700 workstations, was jointly
developed by Hewlett-Packard Laboratories in Palo Alto,
California and Yokogawa-Hewlett-Packard (YHP) Design
Systems Laboratory (YSL) in Kurume, Japan. Tsutsuji, the
Japanese word for azalea, was adopted as the name of the
product because at the inception of the project, Kurume was
hosting the World Azalea Congress.

Input to the Tsutsuji logic synthesis system is expressed as
block diagrams composed of adders, multiplexers, shifters,
register files, and so forth. Tsutsuji transforms these block
diagrams into efficient, electrically and functionally correct
netlists,† which can be implemented in various technologies
(see Fig. 1).

† A netlist is a list of logic gates and the interconnections, called nets, between them.



[Fig. 1 diagram: a block diagram and the netlist file produced from it,
with sample netlist entries for inv, fa, and nor elements.]

Fig. 1. Tsutsuji is a high-performance logic synthesis system. Designs
are expressed as block diagrams, which are transformed by Tsutsuji
into a netlist file that can be used by gate array manufacturers to
produce an application-specific integrated circuit (ASIC).



The most obvious benefit of logic synthesis is that it reduces
the time needed to develop a new product. In a competitive
market, the time needed to develop a product often has a
greater influence on profitability than the product's perfor-
mance or factory cost because of its effect on the technology
potential in the product (see Fig. 2). In addition to shortening
the design phase of the development schedule, logic synthesis
can also reduce the debugging and testing phases by elimi-
nating the errors that inevitably occur when a gate-level
design is produced manually.

A disadvantage of the traditional digital design process is
that designs are not captured precisely until they have been
refined to too low a level of abstraction (Fig. 3a). At this
point, technological dependencies have been introduced and
high-level functions (Fig. 3b) have been obscured. Experi-
ence has shown that these designs can almost never be re-
used to take advantage of faster and cheaper technologies
when they become available. In contrast, Tsutsuji accepts a
high-level, technology-independent design and automatically
maps it to the target technology. Reusing an old design can
be as simple as rerunning the synthesis tools. Freed from
the tedious low-level design tasks, a designer can devote
increased time to the more profound system-level design
issues, which can more significantly influence performance.
Research into the art of implementing a specific function,
for example a multiplier, needs to be performed only once
to embed it into the logic synthesis system, after which it
becomes available to all users of the system.

[Fig. 2 plot: horizontal axis in years (0 to 4), with points labeled
"One Year to Market" and "Four Years to Market."]

Fig. 2. Assume that the technology potential, which includes chip
cost, speed, and density, grows exponentially. Then a project that
can make it to market in one year will be implemented with a tech-
nology that will have four times the potential of a project that takes
four years to market.

Fig. 3. Designs that are expressed directly in the technology of
implementation (a) are often difficult or impossible to remap effec-
tively. Conversely, designs that are created by logic synthesis from
high-level modules (b) are inherently easy to remap.

Logic Synthesis 

The initial focus of Tsutsuji is to assist the design of
application-specific integrated circuits (ASICs). ASIC vendors
typically provide low-level tools for placement and routing,
rule checking, and so forth. Tsutsuji is intended to comple-
ment and augment such tools rather than duplicate them.
Thus, the output of Tsutsuji is the set of files a vendor needs
to produce an integrated circuit.

After it is entered with a graphical editor, the block diagram
describing the circuit is translated to a technology-specific
netlist in two steps. In the first step, module generators,
driven by parameters supplied from the block diagram, ex-
pand the blocks into a generic netlist of simple gates. At this
stage, the gates have no restrictions on fan-in and fan-out
and are essentially equivalent to logic equations. However,
some modules such as multipliers can take advantage of
higher-level primitives like full adders. If it is known at this
stage that the target technology contains these higher-level
primitives, then the modules can be instructed to emit them
rather than the lower-level logic gates. This makes the task
of technology mapping substantially easier and quicker.

During the second step, a technology backend manipulates
the generic netlist into a new netlist that satisfies the design
rules of the target technology (such as fan-in and output drive
restrictions) and exploits the technology's special features.
Module Generators 

The heart of Tsutsuji is a library of module generators, each 
of which can translate blocks of a single functional type into 
a collection of simple generic gates. The library contains 
module generators for all of the kinds of blocks that are typ- 
ically used to construct computer data paths and control 
logic. There are currently about fifty module generators, 
including: 



Adders            ALUs            Counters
Comparators       Decrementers    Decoders
Dividers          Encoders        Incrementers
Majority Logic    Multipliers     Multiplexers
Random Logic      Registers       Register Files
Selectors         Shifters        State Machines



It is important to stress that the library is not composed of 
fixed designs, as are standard cell and gate array libraries. 
Instead, it is composed of generators that can produce an 
endless variety of fixed designs. For example, blocks are 
synthesized with exactly the desired operand lengths. By
adjusting the parameters given to the module generators,
the designer tunes the synthesized circuit to achieve the 
project's cost and performance objectives. The speed of the 
synthesis process permits many design choices to be tried, 
with actual cost and performance data gathered for each. To 
produce a product upgrade, the current design can be reused, 
with blocks regenerated using synthesis parameters that 
yield higher performance. The new product is functionally 
equivalent to the first; consequently, the need for simulation 
and testing is reduced. 

Extensive literature exists describing the implementation of
data path and control logic functions, and much of this
knowledge has been incorporated into the generators. Often
there exist several algorithms that can be used to implement
a given function. For example, the module library includes
ripple-carry adders, carry-lookahead adders, and conditional- 
sum adders. Multipliers can be synthesized using iterative 
cellular arrays or carry-save adder arrays. Best of all, the 
designer needs little understanding of the alternatives since 
all are functionally the same and since fast synthesis provides 
a quick comparison of cost and performance. 

Example: Shifter 

Once an algorithm is chosen, there often remain a number
of structural choices that can influence cost and perfor-
mance. As an example, a 16-bit unidirectional shifter will be
considered in detail. The shifter has 16-bit input and output
data buses. There are also four weighted shift-amount inputs
and a shift-in input.

In the case of the shifter, the library has only one algorithm:
the shifter will be implemented as a collection of n-to-1 multi-
plexers. On the other hand, there are many possible struc-
tural arrangements of the multiplexers that will produce the
desired shifter. For example, the shifter could be structured
as one level of sixteen 16-to-1 multiplexers, or two levels,
each composed of sixteen 4-to-1 multiplexers. Each factor-
ization of the number sixteen yields a different way to struc-
ture the shifter. For example, the factorization (2 8) corre-
sponds to a shifter with a level of 2-to-1 multiplexers and a
level of 8-to-1 multiplexers. Fig. 4 shows topology graphs of
the first-level (generic gates) implementations of all eight
possible organizations of a 16-bit unidirectional shifter. For
an explanation of topology graphs see "Netlist Topology
Visualization" on page 11.

Fig. 4. All eight possible organizations for a 16-bit unidirectional
shifter are shown in this topology graph mosaic. The organizations
derive from the factorizations of 16, the bit width. The factorization
(2 4 2) results in a shifter composed of a level of 2-to-1 multiplexers
followed by a level of 4-to-1 multiplexers and finally followed by a
level of 2-to-1 multiplexers.
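
The count of eight organizations can be checked by enumerating the ordered factorizations of 16, and each factorization can be simulated as a cascade of multiplexer levels. The Python below is a sketch for illustration, not part of Tsutsuji: the function names are hypothetical, the shifter is modeled as a list of bits shifted toward higher indices, and each level is assumed to select one digit of the shift amount written in the mixed radix given by the factors.

```python
def ordered_factorizations(n):
    """All ordered tuples of factors >= 2 whose product is n."""
    results = []
    for first in range(2, n + 1):
        if n % first == 0:
            if first == n:
                results.append((n,))
            else:
                for rest in ordered_factorizations(n // first):
                    results.append((first,) + rest)
    return results


def shift(data, amount, factors, shift_in=0):
    """Shift a list of bits using one level of f-to-1 multiplexers per factor f."""
    weight = 1
    for f in factors:
        digit = (amount // weight) % f  # select input chosen at this level
        step = digit * weight           # positions shifted by this level
        data = [shift_in] * step + data[:len(data) - step]
        weight *= f
    return data


organizations = ordered_factorizations(16)
word = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]
```

Every organization produces the same result for every shift amount; they differ only in the number of levels and in the width of the multiplexers at each level, which is what drives the cost and speed differences discussed in the text.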

Table I contains data for a selection of structures for the
shifter. The speed advantage of the (16) structure, which is
significant in the technology-independent (generic gates)
form, is not very pronounced after the CMOS technology
backend corrects the excessive fan-in and fan-out. Good
compromises between gate count and speed are offered by
both (2 2 2 2) and (4 4); (2 2 2 2) may be favored in a tech-
nology providing only two-input gates. The organization of
the shifter is specified on the module's tuning page. The tun-
ing page is made visible by selecting the module in the block
diagram and then clicking on the tuning page button to the
left of the drawing area. Note in Table I that the (4 4) organi-
zation of the CMOS shifter is only about four percent slower
than the (16) organization and requires only 41 percent as
many cells for implementation.

To summarize, module generators provide designers with 
custom-produced functional blocks with exactly the required 
operand sizes. Designers can choose from a large number of 
functions. Given a function, a number of algorithmic and 
structural choices are usually available. 

Technology Backends 

The technology backends perform two functions: optimiza-
tion and mapping. Optimization improves the cost and perfor-
mance of a circuit. Mapping converts the netlist of generic
gates produced by the module generators into an electrically
correct netlist of gates that can be implemented in the target
technology. Mapping is necessary because the module gen-
erators use gates chosen from a fixed set of functions, which
may be different from those available in the target technol-
ogy. Also, the module generators assume gates may have
unlimited fan-in and fan-out. The technology backends allow
Tsutsuji to realize an important goal: the ability to implement
one design efficiently in multiple technologies.

Table I

16-Bit Unidirectional Shifter
Organization

Our experience with Tsutsuji has shown that relatively simple
backends are most effective. We have tried other systems
with far more sophisticated optimization features. These
systems can considerably improve a poor design, although
often the result still leaves much to be desired. For example,
no system we have seen will convert a ripple-carry adder to
a carry-lookahead adder. However, if the design is nearly
optimal to begin with, then the best optimizers can improve
it very little. Furthermore, these systems are so slow, even
working on small circuits, that they discourage the exper-
imentation and iterative design approach that we wish to
promote.

Tsutsuji designs are, in fact, nearly optimal before they
reach the technology backends. Because the implementa-
tion of data-path structures like adders has evolved to a very
high art, and because our module generators have captured
that art, circuits produced by the generators typically have
excellent fan-in, fan-out, and cost-performance characteris-
tics. For control logic, Tsutsuji uses generators that include
their own optimization algorithms. Since these blocks typi-
cally contain relatively few gates, the optimization per-
formed by the module generators is quick and effective.

The mapping and optimization applied by the backends in- 
volve only small numbers of adjacent gates at a time. These 
transformations, called peephole optimizations, can be per- 
formed far more quickly than the global optimizations used 
in some other systems. Most of the transformations can be 
specified as rules, each of which is a pair of patterns. The 
design is searched for collections of gates that match the 
first pattern in a rule. The collection of gates is then re- 
placed with the second pattern in the rule (see Fig. 5). 
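
A rule-driven peephole pass of this kind can be sketched in Python. The netlist representation (a dictionary from output net to gate type and input nets), the rule table, and the function names below are assumptions made for the example, not Tsutsuji's data structures or rule set. The single rule family shown merges an inverter with the gate that drives it when that gate has no other loads.

```python
# First pattern: a gate of the key type driving only an inverter.
# Second pattern: a single gate of the value type.
MERGE_WITH_INVERTER = {"and": "nand", "or": "nor", "nand": "and", "nor": "or"}


def fanout(netlist, net):
    """Number of gate inputs driven by net."""
    return sum(inputs.count(net) for _, inputs in netlist.values())


def peephole(netlist, primary_outputs=()):
    """Repeatedly replace matches of the first pattern with the second."""
    netlist = dict(netlist)
    changed = True
    while changed:
        changed = False
        for out, (kind, inputs) in list(netlist.items()):
            if kind != "inv":
                continue
            src = inputs[0]
            if src not in netlist or src in primary_outputs:
                continue  # driven by a primary input, or the net must be kept
            src_kind, src_inputs = netlist[src]
            if src_kind in MERGE_WITH_INVERTER and fanout(netlist, src) == 1:
                netlist[out] = (MERGE_WITH_INVERTER[src_kind], src_inputs)
                del netlist[src]
                changed = True
                break  # restart the scan on the modified netlist
    return netlist


generic = {
    "n1": ("and", ("a", "b")),
    "y": ("inv", ("n1",)),
    "n2": ("or", ("y", "c")),
    "z": ("inv", ("n2",)),
}
mapped = peephole(generic)
```

Each match looks at only two adjacent gates, which is why passes like this stay fast even on large netlists.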

A gate with excessive fan-in must be replaced by the Tsutsuji
backends with trees of low-fan-in gates that implement the
same function. To avoid increasing delay, nets on the critical
path should enter the fan-in tree at its root while nets with
plenty of slack† can enter the tree at a deeper level. Fixing
excessive fan-out is analogous: the net with too many loads
is replaced with a tree of buffers plus the original driver,
which serves as the root. In this case, loads should be driven
by gates at a tree depth not greater than the slack through
the load. An algorithm has been developed that builds opti-
mal fan-in and fan-out trees. Optimal in this case means that
no tree can be found that has less impact on the critical path.
Fig. 6 illustrates the construction of an optimal fan-in tree.

† Slack is a measure of how critical the timing is at a gate or net, with zero slack being most
critical. It is defined as the difference between the length of the longest path through the gate
or net and the length of the critical path.

Fig. 6. A single gate with exces-
sive fan-in is replaced by a tree of
gates by the technology backend.
In this example the fan-in limit is
two. The shape of the tree and
the points where the nets enter
it are carefully chosen to avoid
increasing delay.
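
One way to build such a fan-in tree for a fan-in limit of two is a greedy pairing that always combines the two earliest-arriving nets, so that late nets enter near the root. The article does not describe the algorithm the authors developed, so the Python below is only an illustrative sketch under stated assumptions: the function name is hypothetical, every two-input gate is assumed to have the same unit delay, and the gate function is assumed to be associative (for example AND or OR).

```python
import heapq
import itertools


def build_fanin_tree(arrivals, gate_delay=1):
    """Combine nets two at a time, always pairing the two earliest-arriving ones.

    arrivals maps net name -> arrival time. Returns (output arrival time, tree),
    where tree is a nested tuple of net names.
    """
    counter = itertools.count()  # tie-breaker so the heap never compares trees
    heap = [(t, next(counter), name) for name, t in arrivals.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        t1, _, left = heapq.heappop(heap)
        t2, _, right = heapq.heappop(heap)
        merged = (max(t1, t2) + gate_delay, next(counter), (left, right))
        heapq.heappush(heap, merged)
    t, _, tree = heap[0]
    return t, tree


# Net "d" is on the critical path (arrives late); the others have slack.
late_time, late_tree = build_fanin_tree({"a": 0, "b": 0, "c": 0, "d": 3})
even_time, even_tree = build_fanin_tree({"a": 0, "b": 0, "c": 0, "d": 0})
```

With one late input, the sketch places that net at the root gate, so it passes through a single gate; a balanced tree would put two gates after it. With equal arrival times the result is the familiar balanced tree.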

Human Interface 

As massive VLSI becomes more prevalent, a way must be
found to manage the complexity of million-gate systems on
a chip. We wish to elevate the designer's perspective by en-
couraging optimization at the system level rather than at the
gate or transistor level.

A great deal of effort was put into creating a system that
would both encourage system-level thinking and synthesize
and map designs rapidly. To complement this system, we
wished to design a human interface that would evoke the
intuition and even the playfulness of the designer. Our intent
was that the designer would read the instructions after using
the system.

The analogy that the YSL design team chose for the Tsutsuji
human interface was that of the engineer's design notebook
(see Fig. 7). At a level above this is the concept of the library,
which is simply a collection of notebooks and component
catalogs that can be used in any design.

Design notebooks are broken down into pages. The first page
is the index page, by which all other pages can be accessed.
As the design progresses, pages are automatically added to
the design notebook. For example, in a hierarchical design, a
number of lower-level components would be created. Each of
these components along with the top-level design would then
automatically be added and appear in the index. Subsequent
pages would be added to reflect the results of technology
mapping, timing, and topological analysis.

Fig. 7. Tsutsuji presents the design as an engineer's design
notebook. At the level above the notebook is a library consisting of
other notebooks and component catalogs.

Block Diagram Design Entry

Nearly all substantial designs start out as block diagrams. We
have chosen this natural form of expression as the principal
form of design specification within Tsutsuji.

The design is entered by means of a block diagram drawing 
editor. The editor allows the designer to create, copy, move, 
delete, and connect graphical block diagram objects freely. 
A block diagram object can be a wire or bus connecting two 
or more modules, a module, or even a list of logic equations.
Objects can be readily copied from other block diagrams in 
other design notebooks. To connect modules, the designer 
need only point at the appropriate connection points and 
Tsutsuji will automatically route the line. Modules that 
are already connected can be moved and Tsutsuji will 
automatically reroute the connections to the module. 

Hierarchical designs can be created by entering a design in
the normal manner and then putting the design in the design
book where it can be accessed via the Tsutsuji index page.
Tsutsuji automatically constructs a symbol for the user.
However, fastidious users who want a more distinctive symbol
can use the drawing editor to alter the symbol shape.

Tuning parameters for modules are specified by first selecting 
the module with the mouse and then clicking on the tuning 
page button to the left of the drawing area. A special page
for the selected module will appear and then the parameters 
can be entered (see Fig. 8). 

Certain modules such as bus distributors, carry-save adders,
and multiplexers require a different symbol depending upon
their configuration. Rather than force the user to specify the
shape for each configuration, Tsutsuji has a class of symbols
that are mutable: the form changes as a function of the
tuning parameters (see Fig. 8).

Fig. 8. The tuning page for a module is accessed by selecting a
module with the mouse and then clicking on the tuning page button
to the left of the drawing area. Some symbols such as the
carry-save adder automatically change their form as a function of the
tuning parameters.



Textual Design Entry 

Usually the data-path portion of a design is most naturally
expressed graphically. Text is sometimes appropriate,
however, for specifying the control portion of a design. Logical
Description Format, or LDF, is the Tsutsuji language for
specifying designs textually. LDF is similar to the C
programming language, so it looks familiar to many users.

To use LDF, the user places a box in the graphical design and
connects signals to it. With the mouse, the user then executes
a command that causes an editor window to appear. By
typing LDF text into the window, the designer specifies the
function of the box.

The first two examples in Fig. 9 both specify the same
function, an adder, but do so using two different features of LDF:
random logic and truth tables.

The first line in Fig. 9a lists the four signals that connect the
adder subdesign to the rest of the design. The last two of
those signals are prefixed with an ampersand to indicate that
they are outputs; the first two are inputs. The third line,
which begins with the word net, creates and names two wires,
which will be internal to the subdesign. Other internal signals
will be created automatically if they are needed to implement
the random logic expressions. The line carry = a & b;, as one
might expect, creates an AND gate, connects its output to the
signal carry, and connects its two inputs to the signals a and b.

Fig. 9b shows a truth table for an adder plus LDF text that 
implements the truth table. The truthtable feature in LDF is 
merely a textual structure for expressing a truth table. 
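
As a cross-check of the claim that both descriptions specify the same function, the two forms can be modeled in Python and compared over all input combinations. This is only a sketch: the Python function names are hypothetical, and the logic follows the listings in Fig. 9 as best they can be read from the printed figure.

```python
from itertools import product

def adder_random_logic(a, b):
    """Mirror of the random-logic listing: one AND gate plus an AND/OR network."""
    carry = a & b
    local_net_1 = (1 - a) & b      # NOT a AND b, on one-bit values
    local_net_2 = a & (1 - b)      # a AND NOT b
    total = local_net_1 | local_net_2
    return total, carry

# Mirror of the truth-table listing: explicit cases plus a default row.
CASES = {(0, 1): (1, 0), (1, 0): (1, 0), (1, 1): (0, 1)}
DEFAULT = (0, 0)

def adder_truth_table(a, b):
    return CASES.get((a, b), DEFAULT)

for a, b in product((0, 1), repeat=2):
    assert adder_random_logic(a, b) == adder_truth_table(a, b)
    print(a, b, adder_truth_table(a, b))
```

Exhaustive comparison is practical here only because the function has two inputs. It is meant to illustrate the equivalence, not to suggest how Tsutsuji checks it.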

The automaton structure in LDF allows the user to specify a
state machine. It consists of a list of states. For each state,
expressions are given for the outputs, and conditions are

Random Logic

    adder (a, b, &sum, &carry)
    {
        net local_net_1, local_net_2;
        carry = a & b;
        local_net_1 = ~a & b;
        local_net_2 = a & ~b;
        sum = local_net_1 | local_net_2;
    }

(a)

    inputs    outputs
    a  b      sum  carry
    0  0      0    0
    0  1      1    0
    1  0      1    0
    1  1      0    1

    adder (a, b, &sum, &carry)
    {
        truthtable a, b : sum, carry
        {
            case 0, 1: 1, 0;
            case 1, 0: 1, 0;
            case 1, 1: 0, 1;
            default: 0, 0;
        }
    }

(b)

    StateMachine (a, b, next, &Out)
    {
        automaton {
            STATE_0:
                Out = 0;
                if (next) goto STATE_1;
            STATE_1:
                Out = a | b;
                goto STATE_0;
        }
    }

(c)

Fig. 9. Most design entry is done graphically in Tsutsuji. However,
some portions of designs are more naturally expressed using text.
Tsutsuji provides the LDF language for this purpose. This figure
gives three examples of LDF: (a) specifies some combinational logic
to implement an adder, (b) describes the same adder using a truth
table, and (c) specifies a simple state machine.

given for changing to other states. Fig. 9c illustrates a simple
state machine with two states.

Netlist Topology Visualization 

The topology graph is a new means we developed for
viewing a gate-level design. Unlike a traditional schematic, a
topology graph can display a large design in a single window
and can make the performance characteristics of the circuit
easy to understand. The topology graph also makes it easy
to trace the automatically generated gates back to modules
in the user's high-level design.

Fig. 10 is an example of a topology graph. Circuit inputs are
placed in a column on the left side of the graph. The
horizontal coordinate of a gate is set to be proportional to the delay
of the longest path from the inputs to the gate. Registers
appear twice on the diagram. They are drawn first in the input
column with only the register outputs shown (register
outputs are inputs to the logic gates). They appear again with
only the register inputs drawn at the right-hand endpoint of
one or more paths through the circuit. Circuit outputs also
appear at the right end of paths.



A straight line is drawn between two gates if an output of one
of the gates drives an input of the other. The brightest colors
are used to show connections with the lowest slack. For
example, the critical path in Fig. 10 is drawn in yellow. This
emphasizes the part of the circuit that limits the speed, which
is usually the part of the circuit the designer most wants to
see. Because delay information is inherently graph-oriented,
we have found this graphical presentation of delay
information to be an enormous improvement over the traditional
textual delay report.
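
The two quantities that drive this display, the horizontal position of a gate and the slack used to choose colors, can be computed with a pair of longest-path passes. The sketch below assumes a loop-free netlist given as a dictionary of drivers and a fixed delay per gate. These data structures and names are hypothetical, and the article does not say how Tsutsuji computes the values internally.

```python
from graphlib import TopologicalSorter  # Python 3.9 or later

def analyze(fanin, delay):
    """fanin maps gate -> list of driving gates (circuit inputs map to []).
    delay maps gate -> delay through that gate.

    Returns (arrive, slack). arrive[g] is the delay of the longest path
    from the inputs through g, usable as a horizontal coordinate.
    slack[g] is zero on the critical path and larger elsewhere.
    """
    order = list(TopologicalSorter(fanin).static_order())  # drivers first
    arrive = {}
    for g in order:
        arrive[g] = delay[g] + max((arrive[d] for d in fanin[g]), default=0)
    critical = max(arrive.values())

    # Longest remaining delay from each gate's output to any path endpoint.
    remain = {g: 0 for g in order}
    for g in reversed(order):
        for d in fanin[g]:
            remain[d] = max(remain[d], remain[g] + delay[g])

    slack = {g: critical - (arrive[g] + remain[g]) for g in order}
    return arrive, slack

fanin = {"a": [], "b": [], "g1": ["a", "b"], "g2": ["g1"], "g3": ["b"],
         "out": ["g2", "g3"]}
delay = {"a": 0, "b": 0, "g1": 2, "g2": 3, "g3": 1, "out": 1}
arrive, slack = analyze(fanin, delay)
print(arrive["out"], slack["g3"], slack["g2"])
```

Gates with zero slack would get the brightest color. Here g3 sits off the critical path and has slack to spare.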

Tsutsuji users typically make their high-level designs
functionally correct before they bother to examine their designs
at the gate level. Once the design is functionally correct,
there rarely is any need to look at the gate-level design in
detail. Nevertheless, the topology graph program includes
features for scrolling to any part of the design and zooming
to any desired level of detail.

A particular gate can be selected by clicking with the mouse
or typing the name of the gate. The green circle in Fig. 10
indicates a selected gate. Once selected, the gate can be
brought to the center of the screen and magnified. A pop-up
window of information about the gate can be requested; it
gives information like the type and name of the gate, the
gate's fan-in and fan-out, the slack at the gate, and so forth.
The tree of signals driving the selected gate and the tree of
signals driven by the selected gate can be highlighted, as
shown by the red portion of Fig. 10.

Once a gate has been selected, it is possible to request a
pop-up window showing the names of the gates that drive
and are driven by the selected gate. Clicking on one of the
names causes the corresponding gate to become the
selected gate. This makes it easy to navigate through the
design, following the circuit's interconnections.

When the user types a name into a module selection
window, the named module is then highlighted, as shown in red
in Fig. 11. This allows the designer to correlate blocks in the
high-level design with gates in the gate-level design. The
user can also request a pop-up window of information about
the selected module.

The ability to see a particular high-level module within the
topology graph of the entire circuit is invaluable for setting
module tuning parameters. For example, the designer might
use the mouse to select a gate on the critical path. From the
gate information window, the user would learn the module
from which the gate was synthesized. Then the user would
select that module to highlight it on the topology graph. If
the module were contributing significant delay to the design,
the user might retune the module for higher performance. In
another scenario, the user might select a module that was
not on the critical path and retune it for a slower but
cheaper implementation.

Simulation 

To achieve our project goal of substantially increasing
designer productivity, it was imperative to develop a fast
simulator for Tsutsuji. Traditionally, simulation has been a
process for verifying designs that were nearly complete.
Computers were left unattended for hours or days while
simulations ran and produced reams of paper. We wanted
the Tsutsuji simulator to aid the early phases of design,
producing results in real time and presenting them in the
context of the application. We wanted designers to be able to
experiment with significant design changes and see the
effects instantaneously.

Fig. 10. The topology graph lets a Tsutsuji user view an entire
gate-level design on a single screen. Signals flow from left to
right in the diagram and the longest paths horizontally have the
most delay. The critical paths are colored yellow. A gate has been
selected as indicated by the green circle. Many features of the
topology graph program relate to the selected gate; in this case, the
fan-in and fan-out trees for the selected gate have been
highlighted in red.



A previous YSL product included a simulator that evaluated
about a thousand gates per second. The Tsutsuji simulator
has achieved simulation rates as high as twenty-three million
gate evaluations per second.† Some of this increase is the

† This was measured while simulating a 5000-gate floating-point multiplier using an HP 9000
Model I'iO computer.





Fig. 11. In this topology graph, a name has been typed into the
Highlight Module pop-up window. This causes all the gates and
interconnections within the named module to be highlighted in red.
A Module Information window displays information about the
highlighted module. The ability to see how one module is situated
within the entire topology graph is useful for setting the module's
tuning parameters. For example, if the critical path flows through
the module, the designer may want to tune it for higher speed.
Conversely, if no critical paths flow through the module, the
designer may want to tune it for lower cost.

result of an impressive leap in workstation performance that
occurred between the releases of the two products. Several
other factors also contributed.

The most significant factor was the development of a
special-purpose compiler that produces efficient simulation code.
Much of the work performed on each simulation cycle by
the previous simulator is now performed once during the
compiling phase. Also, Tsutsuji produces circuits that adhere
to strict design rules that make it possible to simulate the
circuits accurately with a much simpler simulation strategy.
For example, the delay in the circuits can be completely and
quickly characterized by a separate static timing analyzer
program; hence, the simulator can ignore all timing issues.
Since Tsutsuji circuits use simple clocking and two-state
Boolean logic, each gate only needs to be evaluated once
per simulation cycle. A gate typically can be evaluated by
a single, simple, machine-level instruction on the host
computer.
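
The idea of doing the ordering work once at compile time and then evaluating each gate exactly once per cycle can be sketched as follows. This is an illustration only: it generates Python source rather than machine code, it handles only two-input gates, and the netlist format and names are assumptions.

```python
from graphlib import TopologicalSorter  # Python 3.9 or later

OPS = {"AND": "&", "OR": "|", "XOR": "^"}

def compile_netlist(gates, inputs, outputs):
    """gates maps gate name -> (op, in1, in2). Signal names must be valid
    Python identifiers. Returns a function that evaluates one cycle."""
    # Ordering is worked out once here, not on every simulation cycle.
    deps = {g: [s for s in ins if s in gates] for g, (op, *ins) in gates.items()}
    lines = [f"def step({', '.join(inputs)}):"]
    for g in TopologicalSorter(deps).static_order():
        op, x, y = gates[g]
        lines.append(f"    {g} = {x} {OPS[op]} {y}")   # one operation per gate
    lines.append(f"    return ({', '.join(outputs)},)")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["step"]

half_adder = {"s": ("XOR", "a", "b"), "c": ("AND", "a", "b")}
step = compile_netlist(half_adder, ["a", "b"], ["s", "c"])
print(step(1, 1))
print(step(0b1100, 0b1010))   # four independent test vectors, one per bit
```

Because bitwise operators act on every bit of an integer, a single call can carry several independent input vectors at once. Whether Tsutsuji exploits this is not stated in the article.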

When the user wishes to simulate a design, Tsutsuji displays
the graphical simulation window. The user can choose buses
to observe and can specify virtual instruments for driving or
viewing values on the buses. Tsutsuji then automatically
runs the simulation compiler and starts the simulation and
the virtual instruments. The simulation program and the
virtual instruments run as separate UNIX* processes that
pass vectors through UNIX interprocess communication
channels. This approach provides a flexible means for users
to add new virtual instruments. To do so, the user needs
UNIX programming skill but does not need to know anything
about the internal structure of Tsutsuji.
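
The article does not document the vector format or the kind of channel used, so the following is only a sketch of the general arrangement: an instrument runs as its own process and receives vectors over a pipe. The one-hexadecimal-value-per-line format and the "peak" instrument are invented for illustration.

```python
import subprocess
import sys

# A stand-in virtual instrument, run as a separate process. It reads one
# vector per line (hexadecimal, an assumed format) and reports the running peak.
INSTRUMENT = r"""
import sys
peak = 0
for line in sys.stdin:
    peak = max(peak, int(line, 16))
    print(f"peak={peak}", flush=True)
"""

def run_with_instrument(vectors):
    """Send simulation output vectors to the instrument through a pipe."""
    proc = subprocess.Popen(
        [sys.executable, "-c", INSTRUMENT],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
    )
    out, _ = proc.communicate("".join(f"{v:x}\n" for v in vectors))
    return out.splitlines()

if __name__ == "__main__":
    print(run_with_instrument([3, 10, 7]))
```

The point of the separation is that the instrument needs to know only the stream format, not anything about the simulator that produces it.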

Simulator Register Allocation 

One of the interesting algorithms developed for the simula- 
tion compiler is the register allocation strategy. Computers 
store data in memory and registers. Registers are scarce and 
fast; memory is abundant and slow. Register allocation at- 
tempts to minimize the movement of data between memory 
and registers and to maximize the amount of calculation 
that is done in registers. 

One of the first things the compiler does is transform the
netlist into a list of instructions for a simple, idealized
computer. These instructions are similar in function to
instructions executed by real computers and are simplified mostly
in the way they refer to data. Many optimizations that are
complex to perform on real computer instructions can be
performed easily and effectively on the simplified
instructions. The compiler removes the simplifications in several
stages until, finally, the simplified instructions become real
computer instructions.

Typical ICs have at most several hundred input and output
signals but have thousands of internal signals. In the
simulation program, the values of the internal signals are stored in
temporary variables. In the list of instructions, there is a
point where a temporary variable first appears and another
point where it is last used. The number of instructions
between these points is called the lifetime of the variable.
Storage (memory or registers) can be used for multiple
variables if their lifetimes do not overlap.

A temporary variable is often used in many instructions. The
first few instructions calculate the value of the variable, while
the rest use the value to calculate other values. The number
of instructions that use a variable is called the reference
count of the variable.

A variable's lifetime and reference count can be used to
measure the desirability of storing the variable in one of the
scarce registers. If the lifetime is long and the variable is in a
register, then many other variables are prevented from using
the register. Hence, a long lifetime argues against putting a
variable in a register. If a variable has a high reference count
and is stored in a register, then many time-consuming memory
references are avoided. Thus, a high reference count argues
in favor of storing a variable in a register. Combining these
ideas, we define the cost of putting a variable in a register to
be the variable's lifetime divided by its reference count.
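
The cost measure can be computed in one pass over the instruction list. In the sketch below an instruction is simply a tuple of the variable names it mentions, which is an assumed representation. The lifetime is taken as the distance between first and last appearance, and every instruction that mentions a variable counts as one reference.

```python
def variable_costs(instructions):
    """instructions is a list of tuples of variable names, e.g.
    ("t1", "a", "b") for an instruction that computes t1 from a and b.

    Returns a dict mapping each variable to lifetime / reference count.
    """
    first, last, refs = {}, {}, {}
    for i, instr in enumerate(instructions):
        for var in set(instr):            # count each instruction once per variable
            first.setdefault(var, i)
            last[var] = i
            refs[var] = refs.get(var, 0) + 1
    return {v: (last[v] - first[v]) / refs[v] for v in first}

program = [
    ("t1", "a", "b"),
    ("t2", "t1", "c"),
    ("t3", "a", "t2"),
    ("out", "t3", "t1"),
]
costs = variable_costs(program)
print(costs["t1"], costs["t2"])
```

Here t1 stays live across the whole list, so it costs more than t2, which is produced and consumed in adjacent instructions.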

Our register allocation algorithm attempts to store low-cost
variables in registers. During register allocation, the compiler
passes sequentially over the instruction list. When a variable
appears for the first time, it is assigned a register if its cost
is low and a register is available; otherwise, it is assigned a
location in memory. After a temporary variable appears for
the last time, its storage becomes available again.

One question remains: how should low cost be defined?
Rather than try to choose a specific threshold to separate
high and low cost, we use an adaptive strategy. Whenever
the compiler tries to allocate a register to a low-cost
variable but finds none available, the threshold is lowered.
Whenever a high-cost variable is assigned to memory and
registers are available, the threshold is raised.
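
A minimal sketch of the sequential pass with an adaptive threshold follows. The starting threshold and the adjustment step are arbitrary assumptions, since the article gives neither, and reuse of memory locations is left out for brevity.

```python
def allocate(instructions, costs, num_registers, threshold=1.0, step=0.1):
    """instructions is a list of tuples of variable names and costs maps each
    variable to its cost (lifetime divided by reference count).

    Returns a dict mapping each variable to a register name or to "mem".
    """
    last_use = {}
    for i, instr in enumerate(instructions):
        for var in instr:
            last_use[var] = i

    free = [f"r{k}" for k in range(num_registers)]
    home = {}
    for i, instr in enumerate(instructions):
        for var in instr:
            if var in home:
                continue                      # already placed at first appearance
            low_cost = costs[var] <= threshold
            if low_cost and free:
                home[var] = free.pop()
            else:
                home[var] = "mem"
                if low_cost:                  # wanted a register, none was free
                    threshold -= step
                elif free:                    # went to memory while registers sat idle
                    threshold += step
        for var in set(instr):
            if last_use[var] == i and home[var] != "mem":
                free.append(home[var])        # storage becomes available again
    return home

program = [("t1", "a", "b"), ("t2", "t1", "c"), ("t3", "a", "t2"), ("out", "t3", "t1")]
costs = {"t1": 1.0, "a": 1.0, "b": 0.0, "c": 0.0, "t2": 0.5, "t3": 0.5, "out": 0.0}
print(allocate(program, costs, num_registers=2, threshold=0.6))
```

In this run the long-lived variables land in memory while short-lived ones share the two registers, because a register is released as soon as its variable has been used for the last time.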

Our register allocation algorithm produces simulation code 
that runs almost four times faster than code that keeps all 
variables in memory. Yet, it is simple and requires minimal 
time and memory while compiling.

Virtual Instruments 

By providing a set of versatile virtual instruments, we hope
to move the designer closer to the application domain and
away from the Boolean logic domain. Presently, Tsutsuji
includes benchtop accessories and instruments that range in
complexity from a simple on/off switch to a network
analyzer. These are all instruments that the user can interact
with in a real-time fashion as the simulation is progressing.
The high speed of the simulator makes the concept of virtual
instruments practical and allows the designer to participate
in an interactive environment.

Probe. Probes are automatically attached to all primary input
and output nodes when Tsutsuji is placed into simulation
mode. The user can optionally connect probes to internal
circuit nodes to aid in monitoring and debugging.

Switch. The switch (see Fig. 12) is a simple one-bit input 
port. It provides a convenient way for designers to interact 
with the logic simulation. 

Fig. 12. Switches are a simple way for the user to interact with and
control the simulation. The switch is activated by use of the mouse.
The name of the input port becomes the title displayed on the
switch panel.

Constant Generator. The constant generator (see Fig. 13) is
the equivalent of a potentiometer connected across a fixed
voltage source and feeding an analog-to-digital converter.
The degree of quantization of the constant generator is
automatically determined by the width of the bus to which it is
connected. Just like a laboratory potentiometer, the constant
generator has coarse and fine adjustments: the outer ring on
the knob is the coarse setting and the inner ring acts as a
vernier. For exact setting, the user can click on the displayed
value with the mouse and then type the value from the
keyboard. The output can be changed between two's
complement and unsigned by clicking on the selector button.
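
How the bus width sets the resolution, and how the same knob position reads in unsigned and two's complement form, can be shown with a short sketch. The mapping from knob position to code is an assumption, since the article does not specify it, and the function names are hypothetical.

```python
def quantize(fraction, bus_width, signed=False):
    """Map a knob position in [0.0, 1.0] to an integer code that fits the bus.

    A wider bus gives more levels and therefore finer resolution.
    """
    levels = 1 << bus_width
    code = min(int(fraction * levels), levels - 1)
    if signed:
        code -= levels // 2        # range becomes -2**(n-1) .. 2**(n-1) - 1
    return code

def to_bus_bits(value, bus_width):
    """Bit pattern driven onto the bus (two's complement for negative values)."""
    return format(value & ((1 << bus_width) - 1), f"0{bus_width}b")

for width in (4, 8):
    value = quantize(0.25, width, signed=True)
    print(width, value, to_bus_bits(value, width))
```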

Fig. 13. The constant generator provides a means for the user to
vary inputs to the circuit while the simulation is under way by
simply turning a knob. The resolution of the output is automatically
determined by the width of the bus to which it is connected. The
output can be presented in either two's complement or unsigned
integer form.

Function Generator. The function generator (see Fig. 14) is a
means of applying stimuli to the simulator. It is modeled after
a conventional analog signal generator. Multiple
variable-period, variable-amplitude waveforms are available (e.g.,
sine, triangle, square, ramp). Data can also be read directly
from a file. The function generator's output bus width (i.e.,
quantization) is determined automatically by the width of the
bus to which it is connected. The binary output of the
function generator can be presented in either unsigned or two's
complement form. An additional useful feature is that the
output of one generator can be used to modulate a second
one to create complex waveforms. The modulation includes
amplitude, frequency, phase, and simple summation.
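
As an illustration of one generator modulating another and of the output being quantized to the bus width, consider the sketch below. The modulation formula, the depth parameter, and the scaling to full scale are assumptions chosen for clarity, not a description of the instrument's internals.

```python
import math

def sine(n, period, amplitude=1.0):
    """n samples of a sine wave with the given period (in samples)."""
    return [amplitude * math.sin(2 * math.pi * i / period) for i in range(n)]

def amplitude_modulate(carrier, modulator, depth=0.5):
    """Scale each carrier sample by (1 + depth * modulator sample)."""
    return [c * (1 + depth * m) for c, m in zip(carrier, modulator)]

def to_codes(samples, bus_width, full_scale):
    """Quantize samples in [-full_scale, full_scale] to signed integers
    that fit a bus of bus_width bits (two's complement range)."""
    top = (1 << (bus_width - 1)) - 1
    return [max(-top - 1, min(top, round(s / full_scale * top))) for s in samples]

carrier = sine(64, period=8)
envelope = sine(64, period=64)
codes = to_codes(amplitude_modulate(carrier, envelope), bus_width=8, full_scale=1.5)
print(codes[:8])
```

Connecting the same generator to a narrower bus would simply mean calling to_codes with a smaller bus_width, which coarsens the waveform.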

Data Viewer. The data viewer (see Fig. 14) is a multimode,
multichannel data display instrument. Each channel can be
individually configured to display data as a conventional
logic analyzer, as an oscilloscope, or in hexadecimal format.
Each channel can represent the data as two's complement or
unsigned. The trace speed is variable and can be optionally
controlled by an external sync pulse. The data viewer
automatically increases the number of display channels as more
input buses are connected to the instrument. Changing the
size of the window automatically rescales the data display.

Fig. 14. Several function generators connected to a data viewer.
The top trace shows an amplitude-modulated waveform supplied by
the top two function generators. The function generator at the left
is supplying the modulation signal for the generator to its right. The
second trace is a frequency-modulated waveform supplied by the
two function generators in the center. The next four traces show
sine and triangle waves in both oscilloscope mode and logic
analyzer mode.

Fig. 15. The network analyzer provides a swept-frequency signal to
analyze a circuit's frequency response with respect to both phase
and gain.

Network Analyzer. The network analyzer (see Fig. 15)
automatically analyzes a circuit's frequency response in terms of
both phase and gain. The instrument provides a signal whose
frequency is swept between the start and stop frequencies as
indicated on the front panel. The scale of the display can be
varied, as can the nature of the sweep (linear or logarithmic)
and the number of samples to be taken at each step.

Pixel Viewer. The pixel viewer (see Fig. 16) provides the user
with a virtual color CRT that can be configured to any
geometry and pixel size. There are a number of types of pixel
viewers, but they fall primarily into two classes: those that accept
a stream of pixels to be written in raster fashion and those
that allow individual pixels to be addressed and written.

Fig. 16. The pixel viewer provides the user with a virtual color CRT
that can be configured to any geometry and pixel size.



Examples 

Tsutsuji is now being sold in the Japanese market by YHP.
Customers have used Tsutsuji to implement a wide variety
of ASICs ranging from digital signal processors to
controllers to digital TV systems. The largest design to date has
170,000 gates, although Tsutsuji can easily handle designs of
one-half million gates or more. The following examples
illustrate how Tsutsuji readily involves the user in the domain of
the application.

Television Decoding Filter. Many Tsutsuji customers are in the
business of designing television receivers. Fig. 17 illustrates
how Tsutsuji can be used to make fundamental design
decisions during the earliest stages of design. The example
shows an experiment to compare two TV decoder filters.
One filter is less expensive to build than the other but
produces lower-quality results. Whether the less-expensive filter
would be good enough is an aesthetic question that is almost
impossible to answer without looking at the images the filter
would produce. Tsutsuji, with its friendly simulation
environment, provided an ideal means for answering that question.

Fig. 17. In this example, Tsutsuji was used to compare two
television decoder filters. A design was created that included both
filters. During simulation, both filters decoded the same image,
producing the two images on the right side of the screen. The
designer could then compare them with the original image, in the
center of the screen, and choose the most appropriate filter.

A design was entered into Tsutsuji that included both filters.
A switch, labeled Sel, lets the user switch between the two
filters during simulation. The function generator to the left
of the switch in this example merely reads the original
image from a file and feeds it to the simulation. The image in
the center of the screen is the original image before
encoding and decoding. The tall pixel viewer on the right displays
the output of the simulation. The instrument labeled viewer
has been placed in oscilloscope mode and shows the input
NTSC television signal, the signal after it is decoded into
chroma and luminance (C and Y), and the signal again after
it is decoded into red, green, and blue.

The simulation was started with the switch set to select the
low-quality filter. The decoded image began filling the output
pixel viewer. Once an entire image had been simulated, the
high-quality filter was selected and another image was drawn.
Once both decoded images were complete, the user could
compare them with the original and make a well-informed
decision about which filter to build.

Image Processing ASIC. Fig. 18 shows an image compositor
ASIC that was designed using Tsutsuji for an image
processing system. The compositor ASIC merges two input images,
producing one output image. The images are merged using
one of two modes. In the first mode, the input images are
treated as though they were transparent, and the output
image is a blend of the two images. In the second mode, the
input images are considered to be opaque. If two objects in
the two input images overlap, then the object that is closest
to the viewer is shown in the output image. The image
processing system includes a tree of identical compositor


Fig. 18. This screen shows a design for an image processing chip
that was designed with Tsutsuji. The designer spent two hours
entering the design, which was then automatically synthesized into
8596 CMOS gate array cells in less than a minute. Actual
images, appropriate in the image processing domain, were used
when simulating the design. The upper two pixel-viewer virtual
instruments show the input images; the lower viewer shows the
blended image produced by the simulation.

ASICs. The tree has much the same function as an individual
compositor ASIC except that it combines many images (not
just two) into a single image.
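
The two merge modes and the tree arrangement can be sketched per pixel. The article does not give the blending formula or the pixel format, so the equal-weight average, the (r, g, b, depth) tuple, and the convention that a smaller depth is closer to the viewer are all assumptions.

```python
def composite(pixel_a, pixel_b, mode):
    """Merge two pixels. Each pixel is (r, g, b, depth), and a smaller depth
    means closer to the viewer."""
    if mode == "blend":
        rgb = tuple((ca + cb) // 2 for ca, cb in zip(pixel_a[:3], pixel_b[:3]))
        return rgb + (min(pixel_a[3], pixel_b[3]),)
    if mode == "opaque":
        return pixel_a if pixel_a[3] <= pixel_b[3] else pixel_b
    raise ValueError(f"unknown mode: {mode}")

def composite_tree(pixels, mode):
    """Reduce many pixels pairwise, mirroring a tree of identical compositors."""
    pixels = list(pixels)
    while len(pixels) > 1:
        paired = [composite(pixels[i], pixels[i + 1], mode)
                  for i in range(0, len(pixels) - 1, 2)]
        if len(pixels) % 2:
            paired.append(pixels[-1])     # odd one out passes up a level
        pixels = paired
    return pixels[0]

red_far = (100, 0, 0, 5)
green_near = (0, 100, 0, 2)
print(composite(red_far, green_near, "blend"))
print(composite(red_far, green_near, "opaque"))
```

In opaque mode the tree gives the same answer regardless of how the inputs are paired, since it always keeps the nearest pixel. With this simple averaging, blend results do depend on the pairing.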

Fig. 18 shows part of the compositor design and the result of
simulating the design in its blending mode. The simulation
inputs and outputs are viewed as images so that the designer
will neither waste time interpreting the simulation nor risk
misinterpreting it. Three pixel-viewer virtual instruments
can be seen. The two upper viewers show the input images
and the third viewer shows the blended result. The
simulation, which required evaluation of about 5000 gates for each
of the 900 pixels in the output image, was completed in less
than a second.

The compositor design was entered into Tsutsuji by an
inexperienced designer in two hours. The design consisted of
approximately thirty high-level modules. The high-level
design was synthesized into a design at the generic gate level
in twelve seconds. It took an additional thirty-eight seconds
to accomplish the following: the design was mapped into a
commercial CMOS gate array library, the mapped design
was translated into the file format that the gate array vendor
accepts, and an exhaustive delay analysis was performed on
the circuit. The resulting design uses 8596 gate array cells
and Kill I/O pads. 

Low-Pass Filter. Fig. 19 illustrates a logic synthesis session
that has progressed to the point of logic simulation. The
example is that of a simple low-pass filter. Instead of the
streams of ones and zeros that are normally associated with
logic simulation, we see waveforms, an appropriate form in
which to view the input and output of a digital filter.

Fig. 19. A simple low-pass filter design example. The function
generator on the left provides a high-frequency signal to be added
to the low-frequency signal of the function generator on the right.
The filter will remove varying amounts of this high-frequency
signal as a function of the percentage feedback, which is controlled
by the constant generator. The logic simulation is performed at
the gate level so the real circuit will perform exactly as observed
on the data viewer.

The illustrated low-pass filter takes a percentage of the
previous input and sums it with one minus that percentage
times the current output to form the next output. We can see
that the major design parameters are indeed parameters, so
the designer can, for example, explore the effects of
quantization by changing the input bus width parameter and then
resynthesizing the design, a process that takes less than a
minute. The other major parameter is the percentage of the
previous output used to compute the next output. Rather
than laboriously type the constant and one minus the
constant for each trial, the designer has added hardware to the
circuit to compute these two values. The constant can then
be applied from a constant generator and varied in real time
while the simulation is progressing.
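
The filter's recurrence can be written out directly. The sketch below follows the description in the text, with next output = p × input + (1 − p) × current output, and models the bus width by rounding and clamping to a signed range. The rounding and clamping behavior and the names are assumptions. The nine-bit default matches the example size mentioned in the article.

```python
def low_pass(samples, percent, bus_width=9):
    """First-order low-pass filter on integer samples.

    percent is the share of the input mixed into each new output.
    Outputs are rounded and clamped to a signed bus of bus_width bits.
    """
    top = (1 << (bus_width - 1)) - 1
    p = percent / 100.0
    out, result = 0, []
    for x in samples:
        out = round(p * x + (1 - p) * out)
        out = max(-top - 1, min(top, out))
        result.append(out)
    return result

step_input = [100] * 6
print(low_pass(step_input, percent=50))
print(low_pass(step_input, percent=100))
```

A smaller percentage makes the output respond more slowly to the input, which is how the filter removes more of the high-frequency component. Varying percent here plays the role of turning the constant generator's knob during simulation.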

After the design has been simulated to satisfaction, the final
synthesis can be performed. Here, the actual binary
constant selected during simulation will be entered. There is no
need to remove the superfluous adder. Since both of its
inputs are now constants, all of its gates will be removed by
the optimizer. The multipliers will also be affected by the
optimizer, since each multiplier has a constant as one of its
inputs. The final task is then to select a particular
technology-specific library and perform the technology mapping. For
this example, a nine-bit filter, the initial synthesis resulted in
a design of 1401 gates. After mapping and optimization, the
design was reduced to 649 gates.

In this example, a designer who was familiar with filter
design (but not necessarily familiar with multiplier design) was
able to enter and synthesize a design for a low-pass digital
filter in about ten minutes. Subsequently, different bit-width
designs were explored by simply changing the bus width
parameter. To observe the effect of the feedback constant in
real time, extra hardware was added to the design to save the
designer's time. This hardware did not penalize the design
because it was later completely removed by the optimizer. In
an hour the designer was able to intuitively explore literally
dozens of designs without becoming enmeshed in the
intricacies of gate-level design. Essentially all of the designer's
creativity and intuition was focused in the application
domain.

Conclusion 

Tsutsuji is a product from YHP in Japan that provides a set
of fast and efficient tools for logic synthesis, simulation, and
design visualization. The graphical nature of the human
interface allows designs to be expressed quickly by the designer.
Rapid synthesis and mapping encourage the designer to
explore the design space interactively in search of an optimum
system configuration. Applying creativity where it will have
the greatest impact, the designer remains focused in the
application domain, knowing that optimization and mapping
into the chosen technology will be automatic. Designs
produced by Tsutsuji are inherently reusable.

Designing a Scanner with Color Vision



The challenge for personal computer imaging today is to duplicate human 
color vision, not only in scanners but also in monitors and printers so that 
colors look the same in all media. The HP ScanJet IIc scanner uses a
proprietary color separator design to provide fast, single-scan, 400-dpi, 
24-bit color image scanning. 

by K. Douglas Gennetten and Michael J. Steinle 



The function of a desktop scanner is to digitize an image or
a document and send the information to a computer, a
facsimile card, or a printer. This allows the digital information
to be processed, printed, and stored for archival purposes. A
desktop scanner can be used for many different job
functions and must be able to scan various types of documents,
photographs, line-art drawings, and three-dimensional
objects that may be placed on the scanner platen. The wide
variety of material that can be scanned presents challenges
for the scanning device.

HP ScanJet IIc Scanner

The HP ScanJet IIc scanner is a 400-dot-per-inch (dpi) flatbed
scanner with black and white, color, and optical character
recognition (OCR) capabilities. It is compatible with PCs
and Apple Macintosh computers and with desktop publishing,
presentation, and text recognition applications. It offers
fast single-pass scanning, easy-to-use software, print path
calibration, a legal-sized platen, HP AccuPage technology
for text scanning, and low cost. Print path calibration
optimizes the quality of the final output by compensating for
differences in output devices and software applications. HP
AccuPage technology, when combined with a software
application that supports it (such as Caere's OmniPage
Professional 2.11), uses special page recognition techniques and
automatically sets the intensity to improve accuracy on text
with nonwhite backgrounds. AccuPage also includes logic
that joins broken characters.

The ScanJet IIc provides 8-bit grayscale and 24-bit color
scanning capabilities. It uses an SCSI (Small Computer
System Interface) for Macintosh computers and a dedicated
SCSI adapter for PC-compatibles and MicroChannel PCs.
Optimum brightness and contrast settings are selected
automatically. Custom scaling is available in one-percent
increments. Online help provides reference and tutorial
information. An optional document feeder handles up to 50 pages
automatically.

HP DeskScan II, the image scanning software included with
the HP ScanJet IIc scanner, has a layered user interface for
both beginning and expert users. Advanced functions are
easily accessed as pull-down menus or floating tools. Image
editing software is included, and a live preview feature
shows the results of changes immediately on the screen.



Color Science 

The experience of color is universal, transcending cultures 
and oceans. This experience always has one common thread: 
there are three elements in the experience of color vision. 
The first element is a source of illumination, the second is 
an object being illuminated, and the third is a detector to 
measure the reflected illumination from the object. 

Illumination 

Humans and many but not all animals see electromagnetic
energy falling between 400 and 700 nanometers as visible
light. Any energy within this range radiating from an object
will influence its color appearance. Sources of illumination,
whether natural or man-made, are characterized by their
spectral power distribution, that is, their strength along the
electromagnetic energy spectrum between 400 and 700
nanometers. The nature of this spectral distribution can
profoundly affect the color of an illuminated object. A common
illustration of this is the color shifts that occur under
tungsten street lights. An extreme example would be laser light:
all objects that are not black are red when illuminated in red
laser light. To have a good color observation environment,
the source of illumination must be broadband, that is, it
must contain a relatively flat and broad spectrum of energy
over the range of visible light. If any areas of the spectrum
are weak or missing, it will not be possible to illuminate
those portions of an object's spectral reflectance
characteristic. The fluorescent bulb in the HP ScanJet IIc is designed
with a mixture of phosphors to produce a broad spectrum of
light energy.
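
The laser example can be made concrete with a small numerical sketch. Here the spectrum is reduced to three coarse bands, and the light leaving an object is modeled as the band-by-band product of illuminant power and object reflectance. The band structure and every number below are illustrative assumptions, not measured data.

```python
# Relative power in three coarse bands: (blue, green, red).
BROADBAND = (1.0, 1.0, 1.0)
RED_LASER = (0.0, 0.0, 1.0)


def reflected(illuminant, reflectance):
    """Light leaving the object: band-by-band power times reflectance."""
    return tuple(p * r for p, r in zip(illuminant, reflectance))


# Hypothetical reflectances of two differently colored objects.
grass = (0.05, 0.60, 0.10)
sky_blue_paper = (0.70, 0.40, 0.20)

for name, obj in (("grass", grass), ("sky-blue paper", sky_blue_paper)):
    print(name, reflected(BROADBAND, obj), reflected(RED_LASER, obj))
```

Under the broadband source each object returns its own distinctive mix of bands. Under the red-only source both return energy in the red band alone, so they can differ only in brightness.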

The Object 

Photons from the source of illumination arriving at the object 
can be affected in one of three ways. They can be transmitted 
through the object, reflected from the object, or absorbed 
within the object (and reradiated as heat or, in the case of
fluorescence, reradiated as light of a different wavelength). 
Reflection is most relevant to the human experience of color. 
Colored objects are characterized by their spectral reflec- 
tance distribution. A vast variety of spectral reflectance 
distributions are found in the natural world. 

Objects viewed with scanners such as the HP ScanJet IIc are
usually in the form of documents. (In the case of the HP
ScanJet IIc, a noteworthy exception is three-dimensional
objects. The ScanJet IIc's illumination, optics, and
single-pass color separation make it unusually capable as a
three-dimensional object scanner.) Colors found on documents
are usually generated with offset-press inks or photographic
dyes. These colorants come in four varieties: cyan, magenta,
yellow, and black. With only these four colors to work with,
very few of the spectral reflectance curves found in nature
can be even approximately reproduced. Fortunately, because
of a phenomenon in human vision called metamerism, this is
not necessary. Without metamerism, any picture containing
grass would have to be created with chlorophyll to provide a
matching color.

The Detector 

In the case of human vision, all of the infinite degrees of 
freedom found in an object's spectral reflectance distribution 
are reduced to only three dimensions. This is the root of the 
phenomenon of metamerism. Because of this, colors can
always be described with just three numbers. For example,
a color can be described by three numbers representing
amounts of red, green, and blue. The same color can be just
as precisely and unambiguously described by numbers
representing its hue, saturation, and lightness. Any of several
other three-dimensional color systems could be used as well. 

Like the human vision system, the human hearing system is
a spectral waveform processor. Unlike the vision system,
however, the hearing system retains all of the spectral
content of audible sound all the way to the brain. This provides
a very important capacity: when one listens to a chord
played on a piano, one can easily discern the individual
notes composing the chord. Also, from the character of the
sound, it is obviously a piano chord rather than an organ or
flute chord played from the same notes. An expert ear can
even tell the brand and sometimes the vintage of the piano!
In stark contrast, the eye cannot see chords. A white paper
illuminated with a yellow light can appear exactly the same
as the same paper illuminated with a mixture of green and
red light. The spectral content, observable with a scientific
instrument, can be radically different while the appearance
is identical to a human. It is this mammoth simplification
(loss) of information that allows us to reproduce the color
of grass green exactly with only four inks or dyes.
Unfortunately, there is a catch. This exact match is, strictly
speaking, guaranteed under one and only one type of illumination.
More on this later.
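
The yellow-light example can be worked through numerically. The sketch below uses a made-up three-channel observer whose responses to three wavelengths are illustrative assumptions (they are not CIE data). It solves for the green and red powers whose mixture produces exactly the same three numbers as a single yellow wavelength.

```python
# Hypothetical responses (long, medium, short channel) of a three-channel
# observer to unit power at three wavelengths. Illustrative numbers only.
RESPONSE = {
    "green_540nm": (0.6, 0.9, 0.0),
    "yellow_580nm": (0.9, 0.6, 0.0),
    "red_620nm": (0.5, 0.1, 0.0),
}


def observe(spectrum):
    """Reduce a spectrum {wavelength: power} to three numbers."""
    return tuple(
        sum(power * RESPONSE[w][ch] for w, power in spectrum.items())
        for ch in range(3)
    )


def green_red_match(target):
    """Green and red powers that reproduce the target long/medium responses."""
    g_l, g_m, _ = RESPONSE["green_540nm"]
    r_l, r_m, _ = RESPONSE["red_620nm"]
    det = g_l * r_m - r_l * g_m            # 2-by-2 system, Cramer's rule
    green = (target[0] * r_m - r_l * target[1]) / det
    red = (g_l * target[1] - g_m * target[0]) / det
    return {"green_540nm": green, "red_620nm": red}


yellow = {"yellow_580nm": 1.0}
mixture = green_red_match(observe(yellow))
print(observe(yellow), observe(mixture))
```

The two spectra share no wavelengths at all, yet the observer reports the same three values for both. That is the sense in which the eye "cannot see chords."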

From Man to Machine 

Scanners like the HP ScanJet IIc bring the gift of sight to
computers. Producing any color image capture device such
as this requires a partial duplication of the human vision
system in the form of electronics and optics. The central
task in this effort is the accurate description of the human
vision system's method of converting spectral energy into
three dimensions of color. This was done many years ago.
Around 1930, primarily for the incipient color television
industry, a group of people were tested for their sensitivity to
monochromatic wavelengths over the visible spectrum.

Fig. 1. CIE standard observer color matching curves.

Each person adjusted the intensity of three lights until a
match of the test wavelength was achieved. A series of such
matches produced a set of three curves called the color
matching functions. An averaged set became the
international standard called the CIE standard observer (see Fig. 1).
These curves form the basis of color television and the HP
ScanJet IIc.

The color matching functions of the standard observer can
be converted into a new and equally valid set of three curves
by multiplying the original curves by a 3-by-3 matrix. The
U.S. National Television Standards Committee (NTSC)
adopted one such set of curves for use in color television
(see Fig. 2). This NTSC standard is used frequently by the
computer graphics industry and was chosen for the design
of the HP ScanJet IIc. To achieve a spectral sensitivity
matching the NTSC curves, a combination of the spectral
characteristics of all the optical elements must be
considered. For the ScanJet IIc this includes the document glass
platen, the lamp, the lens, three color separation filters,
three mirrors, and the photosensitive charge-coupled device
(CCD) detector. To duplicate the human color separation
process, the net combination of all these elements must
produce three color channels that are directly related to the
standard observer through a 3-by-3 matrix operation.

The curves shown in Fig. 2 illustrate the ideal camera
sensitivities for NTSC color television. Note the presence of
several negative lobes. Because of these lobes, a perfect
camera would require more than three detectors (adding one for
each negative lobe) and in fact the very high-end broadcast
cameras often have five or six detectors instead of the three
found in home video cameras. The inability to include
negative lobes slightly diminishes the accuracy of the color
separation process. This degradation also exists for color film.
The result is "instrument metamerism": some colors that
match when viewed by a human observer do not match
when viewed by the instrument, and vice versa.

Fig. 2. NTSC color matching curves.

HP ScanJet IIc Color Separation

Of all of the elements along the optical path of the HP
ScanJet IIc, the lamp and the filters have the most conveniently
alterable spectral behavior, and in the case of the dichroic
filters, this is very restricted. Because of the color
separation method used (see "Color Separator Design," page 55),
each color channel has access to three mutually exclusive
bands of the color spectrum. The curves in Fig. 2 (and any
other set of color matching curves) contain a great deal of
overlap. Some wavelengths are visible to more than one
channel. Only a small amount of overlap is possible with the
method used in the ScanJet IIc, resulting in a slight
degradation of the color performance. However, this color
separation configuration has strong advantages in scanning speed
and single-pass operation. Fortunately, the degradation made
unavoidable by this configuration is small and is minimized
through the optimization process described in the next
section. The HP ScanJet IIc's color performance competes well
with the other desktop scanners in the marketplace.

Measuring and Optimizing 

The lamps in the HP ScanJet IIc are fluorescent. They are
produced with a custom mixture of phosphors that are
specifically designed to aid in the recreation of the NTSC
spectral sensitivities. This ability to create custom spectral
characteristics (see Fig. 3) helps offset the limitations of the
filters. The color separation filters are a dichroic design (see
"Color Separator Design," page 55). Their spectral
characteristics can be altered primarily by moving the crossover
frequencies. They have a fairly square passband
performance that does not match the shapes of the NTSC curves
very well. However, the combination of the filters and the
lamp produces a much closer approximation of the desired
result. Extensive measurement and characterization of the
scanner was performed using a spreadsheet model of all the
spectral characteristics throughout the optical path. This
model was used to optimize the choice of lamps and filter
crossovers. Additional optimization was achieved through
the selection of a carefully determined default 3-by-3 matrix
which is applied to all scanned pixels. This 3-by-3 matrix
provides a closer approximation of NTSC color.
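
Applying a 3-by-3 matrix to every scanned pixel can be sketched as follows. The matrix values here are hypothetical, since the scanner's actual default matrix is not given in this article. They are chosen so that each row sums to 1, which leaves neutral grays unchanged. The rounding and clamping to the 0 to 255 range of 24-bit color are likewise assumptions made for the example.

```python
# Hypothetical correction matrix (not the scanner's real default).
# Each row sums to 1 so that gray pixels pass through unchanged.
MATRIX = (
    (1.20, -0.15, -0.05),
    (-0.10, 1.25, -0.15),
    (-0.05, -0.20, 1.25),
)


def correct_pixel(rgb, matrix=MATRIX):
    """Apply a 3-by-3 matrix to one RGB pixel and clamp each channel to 0..255."""
    out = []
    for row in matrix:
        value = sum(coeff * channel for coeff, channel in zip(row, rgb))
        out.append(max(0, min(255, round(value))))
    return tuple(out)


def correct_image(pixels, matrix=MATRIX):
    """Apply the same matrix to every pixel in a list of RGB tuples."""
    return [correct_pixel(p, matrix) for p in pixels]


print(correct_image([(128, 128, 128), (200, 120, 40)]))
```

The negative off-diagonal terms subtract a little of each neighboring channel, which is one way a matrix can partly compensate for channels that respond to overlapping or imperfectly shaped bands.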

Color Matching 

The low-cost color scanners and printers available today 
contribute to a growing demand for accurate color image 
reproduction. Users of desktop systems having color image
capture, display, and printing capabilities are demanding 
better color image reproduction fidelity. Many factors con- 
tribute to the challenges of color matching. 

Scanner Limitations. Scanner inaccuracies are most commonly
caused by imperfect color matching functions in the
color separation process. Another less obvious source of
error is that typical document scanners provide their own
light source. Any color scan from such a device can only
give color measurement data for documents viewed under 
that particular light. Once the original document's spectral 
reflectance is reduced to the three dimensions of color, it 
cannot be reversed. The necessary information required for 
accurately determining the document's color under a differ- 
ent light source is irretrievably lost. This is true even for a 
scanner with perfect human-like vision and is unavoidable 
without increasing the number of dimensions (sensor col- 
ors) within the scanner. The result is that all color matches 
are conditional. They may, and often do, fail when the view- 
ing conditions are changed. The only way to produce an 
invariant match — one that holds regardless of viewing con- 
ditions — is to capture and reproduce not the color of the 
original but its spectral reflectance. Scanners and color 
printers are not capable of this today.
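
A conditional match of this kind can be illustrated with a toy model. The sketch below assumes four coarse wavelength bands and a three-channel sensor whose red channel cannot distinguish the last two bands. The two reflectances and both light sources are invented for the example.

```python
# Four coarse wavelength bands. The red channel sums bands 3 and 4,
# so it cannot tell them apart. All values are illustrative.
SENSORS = (
    (1, 0, 0, 0),   # blue channel
    (0, 1, 0, 0),   # green channel
    (0, 0, 1, 1),   # red channel
)


def scan(reflectance, light):
    """Three channel values for a reflectance viewed under a light source."""
    return tuple(
        sum(s * l * r for s, l, r in zip(sensor, light, reflectance))
        for sensor in SENSORS
    )


ink_a = (0.2, 0.5, 0.6, 0.2)
ink_b = (0.2, 0.5, 0.2, 0.6)

scanner_lamp = (1.0, 1.0, 1.0, 1.0)    # flat spectrum
office_light = (1.0, 1.0, 1.0, 0.5)    # weaker in the last band

print(scan(ink_a, scanner_lamp), scan(ink_b, scanner_lamp))
print(scan(ink_a, office_light), scan(ink_b, office_light))
```

Under the scanner lamp the two inks produce identical channel values, so nothing in the scan data distinguishes them. Under the second light their red channels differ, and no processing of the three scanned numbers can recover that difference.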

Monitor Limitations. Color monitors produce a wide range of
colors by mixing three different colored phosphors. Espe- 
cially in a well-lighted office, these monitors are limited in 
their ability to recreate the range of visible colors. First, a 
three-gun monitor, no matter how perfect, can never recre- 
ate the colors of the rainbow or any of a large region of 
other saturated colors. Second, because of the surrounding 
light, a typical monitor cannot produce a good black. Third,
such a monitor has difficulty producing a pure, bright white.
These last two points can easily be illustrated with a
computer monitor and a laser-printed page. When the printed
page is held near the monitor, it will typically appear
brighter and whiter than the monitor's white. If the monitor
is turned off (to reveal its blackest black) the black toner on
the page will typically be much darker than the black of the
monitor. An accurate reproduction of a monitor display of a
white page with a black and yellow square would produce a
printed page with many dots in the "white" areas, magenta
dots in the "yellow" areas, and white dots in the "black"
areas. This is rarely the desired result. WYSIWYG (what you
see is what you get) is definitely not desired. Instead, it's
"what you want is what you get" that is desired.

Fig. 3. Spectrum of the lamp in the HP ScanJet IIc scanner.

Printer Limitations. Further compounding the problems of
color matching are the color gamut limitations of low-cost
color printers (a printer's color gamut is the set of all of the
colors it can print). Many displayable and scannable colors
fall outside of the capabilities of most printers. Areas of
images that contain these colors must be modified to
accommodate the printer limits. Once again, the most accurate
reproduction is often not the most desirable.

Managing all of these color matching issues and limitations 
is a very complex task. However, advancements continue to 
be made, and there is reason to hope for steady
improvement in the disquieting situation that exists today on the PC.

Color Separator Design 

The objective of a scanner is to digitize exactly what is on 
the document that is being scanned. However, this is not a 
realistic goal because it would require a CCD
(charge-coupled device) detector with an infinite number of pixels
and a lens with a modulation transfer function (MTF) equal
to 1.0, which does not exist. (Modulation transfer function is
a measure of the resolving power or image sharpness of the
optical system. It is analogous to a visual test that an
optometrist would use to determine a human eye's resolving
power.) Most important, the scanner user does not require
an exact reproduction of the original because the human eye
does not have infinite resolving power. The HP ScanJet IIc
scanner is designed to obtain very fine-detailed images for a
variety of color and black and white documents and objects
that are typically scanned.

Designing a high-performance, low-cost desktop scanner
required a team effort involving the disciplines of optical,
mechanical, electrical, firmware, and software engineering.
Some key decisions that affected the design architecture
were resolution (dots per inch), gray level depth, optical
scanning resolution, scan time, product size, image quality,
and product cost.

After the product was defined, a color separation technique
was decided upon. This affected all the engineering
disciplines involved in the product design. Various color
separation techniques are used in the image reproduction industry.
A few of the common techniques are:
• Colored dyes deposited on the CCD substrate. Used in
camcorders, scanners, and color copiers.
• Rotating or translating red, green, and blue filters. Used in
scanners.
• Red, green, and blue flashing lamps. Used in scanners.
• Beam-splitting prisms with multiple CCD sensors. Used in
scanners.

To meet the performance and cost goals for the HP ScanJet
IIc, a new HP proprietary color separation method was
developed and implemented. The initial development was done
at HP Laboratories in Palo Alto, California and the
technology was transferred to the Greeley Hardcopy Division in
Colorado for continued development and implementation.

The color separation system consists of a lens, two color
separators, and a CCD detector as shown in the photograph,
Fig. 4. Each color separator is a laminated assembly as
shown in Fig. 5. Each assembly is made of three glass plates
that are bonded to each other with a thin layer of optical
adhesive. Red, green, and blue reflective coatings are
deposited on the glass before lamination. Specifically, dichroic
coatings (2 to 3 µm total thickness) are deposited onto the
glass substrates. Good spectral performance is obtained using
dichroic coatings, resulting in an accurate colorimetric
device.

The distance between colors at the CCD detector (see Fig. 5)
depends on the thicknesses, index of refraction, and angles
of the glass plates separating the red, green, and blue
reflectors. The plates are thin glass substrates that have tightly
controlled flatness, thickness, and angle tolerances. The thin
plates are laminated to a thick baseplate, which provides
mechanical rigidity and flatness. During the multilayer
dichroic coating process the thin plates are distorted, but
laminating them to the thick plate restores the flatness of the
reflective surfaces. The first laminated plate has the color
order of blue, green, red while the second plate has the
order of red, green, blue. This configuration equalizes the
optical path lengths to ensure simultaneous focus for all three
colors. The order of coatings was selected to maximize
spectral efficiency and simplify the coating process.

Each color component is focused onto a CCD row, each row
consisting of 3400 imaging pixels (additional pixels are
available and are used for light monitor control and dark
voltage correction). The CCD generates a voltage signal that
is proportional to the amount of light incident on the
detector. This signal is processed and then digitized. Having a
CCD that integrates all three rows and senses all three colors
simultaneously yields a single-pass scanner with excellent
image quality. This color separation method also provides
high-performance scanning capability in a small integrated
package that is cost-effective and manufacturable at high
volumes.

Fig. 5. HP ScanJet IIc color separation method.

A layout of the optical system showing the light path is
shown in Figs. 6 and 7. Fig. 6 also shows the solids model of
the carriage, which includes the dual lamp assembly, three
mirrors, the lens, the color separator, and the CCD assembly.
The carriage is translated along the length of the document
glass platen by a stepper motor drive system and a belt that
is connected to the carriage. In Fig. 7 the light path is drawn
for several rays from the scanned region. The lens is a
six-element double Gauss design that yields a very good MTF.

Fig. 6. Solids model of the HP ScanJet IIc optical path and carriage.

The optical system was designed and evaluated using a
commercially available optical design program. Unlike many
other engineering disciplines such as finite element analysis,
for which it is more difficult to predict accurately how a
fabricated prototype will perform, the performance of an
optical system can be calculated very accurately. The effects
of tolerances on the optical system were also modeled to
ensure that the product could be manufactured at high
volumes. Modulation transfer function (image sharpness) was
evaluated for tolerances such as lens centering, tilt,
accuracy of lens radii, index of refraction, and color separator
flatness and thickness. A typical plot of modulation at 105
line pairs per inch (object side of the lens) as a function of
position across the page is shown in Fig. 8. Modulation is
the sharpness of the image at a specific line pair frequency,
whereas MTF is the sharpness of the image as a function of
line pair frequency. Fig. 8 demonstrates that the resolving
power of the scanner varies only slightly with the location
on the glass platen. This data includes the effect of the
CCD's modulation:

Modulation = Modulation_optics × Modulation_CCD.

For fabricated optics tested on an optical bench, the mea- 
sured through-focus data agreed closely with the calculated 
results. 
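
The product relation for modulation can be sketched numerically. The article does not say how the CCD's modulation was obtained. The sketch below assumes a common textbook model, an ideal square pixel aperture whose modulation is |sin(πfw)/(πfw)| for spatial frequency f and pixel width w. The optics value of 0.70 and the function names are made up for the example.

```python
import math


def ccd_modulation(freq_lp_per_inch, pixels_per_inch=400):
    """Assumed model: ideal square pixel aperture, |sin(pi f w) / (pi f w)|,
    with pixel width w = 1 / pixels_per_inch on the object side."""
    x = math.pi * freq_lp_per_inch / pixels_per_inch
    return 1.0 if x == 0 else abs(math.sin(x) / x)


def system_modulation(optics_modulation, freq_lp_per_inch, pixels_per_inch=400):
    """Modulation = Modulation_optics x Modulation_CCD."""
    return optics_modulation * ccd_modulation(freq_lp_per_inch, pixels_per_inch)


# 0.70 is a hypothetical optics modulation at 105 line pairs per inch.
total = system_modulation(0.70, 105)
print(round(ccd_modulation(105), 3), round(total, 3))
```

Because both factors are at most 1, the combined modulation can never exceed that of the optics alone, which is why the detector's contribution has to be included when predicting the sharpness of the whole scanner.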

To achieve precise optical alignment, custom tooling was
designed and fabricated to meet production goals.
Translational alignment of ±10 µm is required for focus and for
centering the light path on the CCD. The alignment tools,
consisting of translational and rotational stages, are controlled with
an HP Vectra 386 computer and software that consistently
gives optimized optical alignment.

and an MBA degree from Northeastern University
(1987). His other professional experience includes
work on operating systems at Wang Laboratories.

MSME in 1984. He's the coauthor of four articles
related to the work he did on blood flow through a
heart valve prosthesis while he was at Purdue. His
work has resulted in four patents on color separator
design and scanner optical systems. Mike is married
and has two daughters. He volunteers at a local youth
center and is active in his church. His leisure activi-

learning GUI programming on his home computer




Daniel G. Maier

A software engineer at the
Imaging Systems Division,
Dan Maier joined HP in
1989. He was born in Rochester,
New York and is a
graduate of Rensselaer Poly-

Mechanical Considerations for an
Industrial Workstation 



Besides being a compute and data processing engine, a workstation in an 
industrial and measurement environment must be mechanically designed 
to handle the special requirements of these environments. 

by Brad Clements 



The HP 9000 Models 745i and 747i are entry-level industrial
workstations. These systems are designed for test and
measurement, industrial process control, and electronic testing
applications. Both machines are based on HP's PA-RISC
version 1.1 architecture,1 and they both run the HP-UX 9.01
operating system. Except for dimensions and EISA and VME
slots, both machines provide the same features. Fig. 1 shows
a rear view of the Model 745i and 747i workstations.




Fig. 1. Rear views of HP 9000 Series 700i industrial
workstations. (a) Model 745i. Overall size 176.75 mm high by
425.45 mm wide by 412.6 mm deep (6.97 inches by 16.75 inches by
16.2 inches). (b) Wallmounted Model 747i. Overall size 310.16
mm high by 425.45 mm wide by 412.6 mm deep (12.21 inches by
16.75 inches by 16.2 inches). Labeled parts: mass storage module
(default back access), two EISA expansion slot modules, internal
mass storage SCSI cable connection, SPU module, SGC (graphics)
module, and six VMEbus expansion slots.

Background

At the beginning of the investigation phase for the industrial
workstation project, a team from R&D and marketing set
out to answer the question "what makes an industrial
workstation different from a standard workstation?"† Dozens of
customers in the measurement and industrial automation
markets were visited to help us understand their needs that
go beyond the features provided in HP's line of standard
workstations. This article addresses the mechanical design
aspects of the differences between standard and industrial
workstations, and the design strategy we used to meet the
needs of customers in the industrial marketplace who use or
could use engineering workstations.

Serviceability 

Unlike standard workstations, industrial workstations are
intended to be incorporated in large, very complex
manufacturing processes that produce products worth extremely
large amounts of money per hour. The cost of downtime
demands the highest level of serviceability. Trade-offs for
cost that compromise serviceability cannot be made. Our
goal was to provide access to all service-level components
in less than three or four minutes.

All service-level components in the Model 745i and 747i
industrial workstations, including the backplane, can be
removed and replaced from the cable end of the computer
while the computer chassis remains mounted in the rack.
This feature sets a new standard for serviceability in this
industry. To make the serviceable modules, or bricks,†† easy
to remove, an extractor handle was developed which holds
a captive spring-loaded retracting screw (see Fig. 2). The
handle provides a trigger grip for the index finger and a
fulcrum surface for the thumb when removing adjacent bricks.
The handle also provides a surface to push on while seating
the bricks. Regulatory compliance dictated the use of a tool
to remove all bricks. The captive screw, which is housed in
the handle, visually pops forward to indicate to the operator
that the brick is unfastened. Once the bricks are removed an
internal wall (see Fig. 3) swings up to unlatch so that it can
be taken out of the cabinet to allow the customer to remove
the backplane by undoing a single captive fastener located
on the backplane.

Fig. 2. CPU brick showing the extractor handle.

† A standard workstation is one that is typically used for program development or running
application programs (e.g., CAD/CAM, desktop publishing, etc.).

†† A brick is the term we use for all the modules designed for the Model 745i and 747i
workstations.

Connectivity 

In addition to the robust core I/O capabilities offered by
HP's standard workstations, the Models 745i and 747i
provide an HP-IB interface as part of the core I/O. To provide
I/O functionality that goes beyond that offered as core I/O,
expansion slots are provided. The number of slots requested
for industrial workstations is not only greater than for
standard workstations, but the types of I/O slots are mixed.
Besides the core I/O, the current HP standard workstations
only provide EISA slots, which support several I/O
protocols.2 In addition to supporting EISA slots, the Model 747i
also supports VMEbus. The package for these machines was
designed to be large enough to house the larger
cards such as VXIbus cards.†

Fig. 3. Gaining access to the Model 747i backplane by removing the
internal wall.

Support Life 

Support life is a very important consideration to the indus- 
trial automation customer. Once an industrial workstation 
has been designed and installed into a factory process it is
rarely replaced or upgraded for reasons other than loss of
support. Support life is not something that is designed in,
but rather a promise or commitment made to customers by
HP. The current standard workstations are supported for
five years while the Models 745i and 747i carry a 10-year
commitment. To reflect a long support life, the industrial
design of the Models 745i and 747i has a much plainer and
timeless look (see Fig. 4) than the new line of standard 
workstations. 

Reliability 

In many standard workstation applications the hardware 
becomes obsolete long before physically wearing out because 
of reasons such as the availability of lower-cost machines or 
machines with faster graphics engines. With industrial
workstations this may not be the case because certain items like
the fan may not have the same 10-year or even 20-year life
that a factory installation may have. For example, extensive
testing was done on fan bearing systems to select the best fan
for the Models 745i and 747i, but the life expectancy of the
fan is still not greater than the service life of the workstation.
Thus, the power supply carries a fan-tachometer signal and
an overtemperature signal, and is serviceable. More details
relating to fan and airflow reliability are discussed later in
this article.



† As of this writing VXIbus cards are not yet supported in the HP 9000 Series 700i machines.




Fig. 4. Rackmounted Model 747i with noncable end out.

Fig. 5. Rackmounted Model 747i with cable end out.



Graphics 

In a typical standard workstation configuration only one
large color display needs to be supported because the user
is able to access multiple applications using windows.
However, in some industrial automation environments, industrial
workstations are required to support several large graphics
displays. For example, in a control room application large
monitors are used to replace walls full of critical instrument
gauges. The user or control room operator needs to monitor
more gauge images than can be seen on one monitor screen
without paging through windows. Windows are still needed
for less critical gauges and other operations.

Front-to-Back Reversibility

For measurement automation customers, the business end
or user interface end of the Models 745i and 747i is the
noncable end of the package (see Fig. 4). All the cables and
clutter are hidden in the rear of the machine inside of the
rackmount cabinet, which has an access door in the back.
User-accessible mass storage bays, an on/off switch, and
diagnostic LEDs are located at the front end of the machine,
which is the most cosmetic surface of the product.

On the other hand, the industrial automation customer
typically wants the cable end of the machine to be the user
interface end of the product, with the diagnostic LEDs, on/off
switch, and user-accessible mass storage bay also located at
the cable end of the machine (see Fig. 5). The Models 745i
and 747i were designed to allow HP manufacturing to
configure the computer to meet the needs of both the
measurement automation and the industrial automation customer.
Front-to-back reversibility is provided by redundant on/off
switches, redundant diagnostic LEDs, and a mass storage
brick that allows user-accessible devices to be located at
either end of the product.

Mounting Options 

Standard workstations are designed to live in an office
environment with the workstation cabinet sitting under a
monitor on a desktop or as a minitower on the floor beside the
desk. The industrial workstation is required to live in
rackmount and other mounted environments. The Models 745i
and 747i can be mounted in a variety of different
configurations. They can be rackmounted from the cable end,
rackmounted from the noncable end, stacked on a bench with
other HP products, wallmounted with cables facing out from
the wall, or mast mounted close to the center of mass of the
product (see Fig. 6).

Package Form Factor 

In a rackmount environment, package height is always
important to instrument and measurement automation customers,
but perhaps more important to industrial automation
customers is the package depth. The Models 745i and 747i
are designed to fit inside a 450-mm (17.7-in) deep
wall-mounted cabinet with the door closed. With the front bezel
removed the distance from the mounting wall to the I/O
connector surface is 355 mm, leaving a 95-mm depth for cables.
The height of the package was driven by the nearest even
number of rackmount units that a 120-mm fan and line filter
stack would fit in. With feet removed the Model 745i is four
EIA (Electronics Industries Association) standard rack units
(177 mm) high and the Model 747i is seven EIA standard
rack units (310.4 mm) high. The width of the package is 425
mm to allow nonrackmounted stacking on a lab bench with
other standard HP 425-mm-wide instruments.
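
The package dimensions above follow from simple arithmetic. The short Python sketch below reproduces it, assuming the standard EIA rack unit of 1.75 in (44.45 mm); the function and constant names are illustrative only and do not come from the original design documentation.

```python
# One EIA rack unit (U) is 1.75 in = 44.45 mm (standard value, assumed here).
RACK_UNIT_MM = 44.45


def rack_height_mm(units: int) -> float:
    """Nominal height of the rack space occupied by `units` EIA rack units."""
    return units * RACK_UNIT_MM


def cable_depth_mm(cabinet_depth_mm: float, wall_to_connector_mm: float) -> float:
    """Depth left for cables between the I/O connector surface and the door."""
    return cabinet_depth_mm - wall_to_connector_mm


if __name__ == "__main__":
    print(rack_height_mm(4))         # nominal 4U space for the Model 745i
    print(rack_height_mm(7))         # nominal 7U space for the Model 747i
    print(cable_depth_mm(450, 355))  # cable space in the wall-mounted cabinet
```

The nominal 4U and 7U spaces come out slightly larger than the quoted package heights of 177 mm and 310.4 mm, which is consistent with a package built to fit inside its rack opening.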



Airflow Management and Acoustics 

The HP acoustic noise goal for office environment products is
50 dBa maximum sound power level. Standard workstations
struggle to meet this goal without making thermal compromises.
Industrial workstations can be found in control room or
factory-floor environments, which can be warmer than a
typical office. The variety of mounting options provided by
the Models 745i and 747i introduces airflow inlet constraints
not required of standard workstations. To provide more
thermal margin at higher temperatures with constrained
airflow inlets, the 50 dBa goal was compromised. The Model
745i noise level is about 54 dBa and the Model 747i noise
level with two fans is about 57 dBa.

The Models 745i and 747i incorporate a negative pressure
airflow design. Unlike a positive pressure airflow design,
which allows airborne particulates to be filtered out through
an inlet filter, the negative pressure system has no filter.
Small inlet filters fill with airborne particulates in a
relatively short time, greatly reducing the volume of air that
moves through the product. Experience has shown that
these small filters do not get cleaned as often as required
and lead to system reliability problems. Rather than filtering
dust, the negative pressure design passes most dust through
the product. The dust that does collect over time inside the
product is far less detrimental than a clogged filter. For
extremely dusty environments the product should be housed
in an enclosure that provides air filtering on a scale that can
adequately and reliably filter airborne particulates. The
negative pressure approach offers some additional benefits.
First, a much larger inlet area is possible, which reduces
total airflow impedance through the product. Second, an
uninterrupted airflow zone in front of the fan introduces
more laminar airflow to the fan blades, which reduces
acoustic noise. Finally, airflow is more uniform. Having
more options for inlet locations provides better airflow
rationing throughout the product.

When viewed from the cable end of the product, the main air
inlet is on the left side of the product (see Fig. 7). In an
industrial automation installation the left side typically has far
fewer cables than the right side. This relatively small number
of cables on the left side of the product creates little
airflow impedance.

Fig. 7. Mass storage brick with user-accessible devices located at
the cable end. Also shown are the air inlets and the carriers that
hold the mass storage devices in place.

In addition to the inlet holes on the left side, inlet holes are 
provided on the front of the product. The front holes are 
redundant, allowing the air inlet on the left side to be partly 
restricted as in a very tight rack installation with little plenum 
space on the sides. Air flows across the bricks and into the 
power supply. In an industrial automation installation, the 
cables that come into the system rack and lie along the right 
side of the product can be so numerous that airflow through 
them can be difficult. Therefore, the air exhaust designed
into the Models 745i and 747i is out the cable end of the
product through the power supply (see Fig. 8).

The power supply is equipped with a temperature sensor 
that is located near the exhaust fan. This sensor controls the 
fan speed and is located downstream in the airflow path so 
that the fan will speed up when the system is heavily loaded, 
the ambient air is relatively warm, or the inlet is partially 
restricted. The airflow through the Model 745i is a generous
56 ft³/min at low speed and 70 ft³/min at high speed. The
Model 747i with two fans moves 105 ft³/min of air at low
speed and 132 ft³/min at high speed.

In the Model 747i, which has two power supplies and one
sensor for each supply, each sensor can also sense when the
fan associated with one of the power supplies is not
operating properly. When this happens, the operating fan will be
sped up, pulling air through the power supply with the de- 
fective fan. This should extend the life of the power supply 
with the defective fan until the controlled process can be 
shut down in a graceful and less disastrous manner or until 
control can be passed to a redundant computer. 
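
The fan behavior described above can be summarized as a small decision rule. The Python sketch below is illustrative only: the threshold temperature, the `SupplyStatus` type, and the function name are assumptions made for this example, since the article does not give the switching temperature or the control implementation.

```python
from dataclasses import dataclass


@dataclass
class SupplyStatus:
    exhaust_temp_c: float  # reading from the sensor near the exhaust fan
    fan_ok: bool           # derived from the fan-tachometer signal


# Hypothetical threshold; the article gives no switching temperature.
HIGH_SPEED_THRESHOLD_C = 45.0


def fan_speeds(supplies: list[SupplyStatus]) -> list[str]:
    """Return "low", "high", or "failed" for the fan in each power supply.

    A working fan runs at high speed when its own exhaust air is warm or
    when the fan in any other supply has failed, so that it pulls air
    through the supply with the defective fan.
    """
    any_failed = any(not s.fan_ok for s in supplies)
    speeds = []
    for s in supplies:
        if not s.fan_ok:
            speeds.append("failed")
        elif any_failed or s.exhaust_temp_c >= HIGH_SPEED_THRESHOLD_C:
            speeds.append("high")
        else:
            speeds.append("low")
    return speeds
```

For a two-supply Model 747i, `fan_speeds([SupplyStatus(30.0, True), SupplyStatus(30.0, False)])` reports the first fan at high speed and the second as failed, which mirrors the failover behavior described in the text.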

Brick Strategy 

The wide range of measurement and industrial automation
customer needs could not be met with just one product.
Therefore, we had to develop a strategy to offer a high
degree of flexibility for product features. In an ideal world, the
best approach to providing different product features would
be to design a family of subassemblies, or bricks, which
could be mixed and matched in many different configurations.
Each brick would adhere to standard size constraints
such as width, depth, incremental height units, and
electrical interconnect standards. Conceptually the OEM customer
would be able to select the number, type, and mix of I/O slots,
the number and type of graphics display interfaces, the
number and type of mass storage devices, and the number and
type of CPU options.

Fig. 8. Power supply module.

For the Model 745i and 747i workstations, the width of the
standard brick was driven by the width of two EISA cards
laid side by side. The maximum depth of a brick was driven
by the length of an EISA card. The standard brick height
increment concept was abandoned to allow the products to
fit into a smaller package while adhering to EIA standard
rackmount increments. The electrical interconnect standard
was also abandoned because of physical connector space,
connector cost, and high insertion forces. Flexibility for
future upgrades was traded off for greater serviceability and
lower cost.

Industrial and measurement automation customers rarely 
upgrade a system after it is installed. Therefore, rather than 
designing a standard package with optional expanders that 
carry the added cost of box-to-box interconnect and make
the removal of the backplane in the rack impossible, an ap- 
proach of using standard bricks housed in a variety of differ- 
ent sized chassis was implemented. Each brick has the same 
backplane. 

The Model 745i uses a 4U (four EIA instrument rack units)
high box and holds a CPU brick, a four-slot EISA brick, a
mass storage brick, and a power supply brick (see Fig. 1a).
The Model 747i uses a 7U package which holds a CPU brick,
a two-slot EISA brick, an SGC (standard graphic connect)
brick, a six-slot VMEbus brick, and two power supplies (see
Fig. 1b). The boxes contain two internal walls that support
the card guides, and a structure to support the bricks. These
walls can be separated from the chassis, making it possible
to design other versions of walls quickly. This feature allows
different versions of the industrial workstations to be
designed for OEM customers. The versatility offered by the
walls allows a shorter time to market for future products and
reduces the development cost of redesigning an entire
package. The backplane, which provides power and bus signals
between bricks, is unique for each product developed.

CPU Brick. The HP PA-RISC processor delivers more than
enough processing for the vast majority of customers in the
industrial and measurement markets. However, customers
do want HP PA-RISC machines for the expected support life.
Standard 16M bytes of SIMM ECC (error correction code)
RAM with optional configurations up to 128M bytes is
supported. The core I/O includes HP-HIL, parallel, two serial
ports, audio in and out, SCSI, AUI (access unit interface)
LAN, HP-IB, and onboard 1280-by-1024-pixel graphics
memory. The CPU brick is housed in an aluminum extruded
frame to provide additional mechanical board support
during insertion, to protect surface mount components on the
underside when outside the product, and to offer a rugged
industrial appearance and feel. Fig. 2 shows the CPU brick.

Fig. 9. EISA card brick with four slots.

EISA Brick. To save space in the product the EISA I/O cards 
are oriented horizontally (see Fig. 9). The structure that
supports the cards along with the converter circuits is easily
removed for service or upgrades. Easy access to EISA I/O 
cards is a feature that adds to the competitiveness of our 
workstations in the industrial marketplace. Almost all of the 
PCs used in the industrial marketplace require the user to 
remove the workstation cabinet from the rack and then 
open a clamshell case to service or upgrade I/O cards. 

Mass Storage Brick. The removable tray that holds the mass 
storage devices is structurally reinforced so that the me- 
chanical vibration frequency response is high. The tray is 
firmly supported at one end by three tight-toleranced pins
and at the other by two captive threaded fasteners. This
solid foundation approach required no additional vibration
mounts beyond those designed into the individual mass
storage devices by the manufacturer. This approach not only is
lower in cost for the majority of customers, but provides a
significantly more rugged system. However, customers with
systems that are vehicle mounted will require very soft
vibration isolators and thus a larger shock zone around the
disk, both of which lead to higher costs and a physically
larger product. Fig. 7 shows a mass storage brick.

The individual mass storage devices are held in place by
carriers that were leveraged from the high-volume HP 9000
Model 425e workstation. These carriers, which are shown in
Fig. 7, can be oriented towards either the cable end or the
noncable end of the tray by means of interlock details
located in different places on the tray.

The SCSI interface to the mass storage devices is provided 
by an external shielded cable which comes from a filtered 
connector on the CPU brick. This approach was leveraged 
from the design used in the HP 9000 Models 720 and 730. 
Besides providing excellent EMI and ESD performance, this
design allows the user to connect to an external mass storage
device rather than the devices on the mass storage tray. This 
capability is useful for diagnostics. 

VMEbus Brick. The VMEbus brick, shown in Fig. 10, provides
six VMEbus slots. The entire brick, which includes the
VMEbus cardcage, backplane, and translation circuit, is
removable as one piece. Customers are delighted to have the
ability to remove the VMEbus brick and take it to a lab bench to
work on. With the brick removed, access to the P2 connector
is convenient. A cable passage slot allows easy passage
of ribbon cable from the rear of the backplane to inside the
cardcage.

Fig. 10. VMEbus brick with six VMEbus slots. The first two slots are
occupied by a two-slot VMEbus module.

The cover shown in Fig. 10 is required to provide RFI regula- 
tory compliance. The customer can modify the cover to add 
the desired bulkhead-style connector hole patterns and pro- 
vide cables with service loops as required for each different 
configuration. Most customers elect to eliminate this part 
when it is not required. 

SGC Brick. Standard graphic connect, or SGC, allows access 
to HP graphics and is a standard feature of HP 9000 Series 
700 workstations. 

Power Supply 

The power supply delivers up to 300 watts. Once the power 
supply is removed, the 120-mm fan housed inside the power 
supply is accessible by removing only two screws. A floating 
connector system prevents damage from mechanical shock. 
The power supply is wrapped in metal. Besides protecting
the user from electrical shock, this reduces EMI between
the power supply and the CPU or other EMI-sensitive bricks.
Online CO2 Laser Beam Real-Time
Control Algorithm for Orthopedic 
Surgical Applications 

New data obtained from treating polymethylmethacrylate (PMMA) with a 
nonmoving, CW, 10-watt, CO2 laser beam is presented. Guidelines based 
on this data can be used during precision laser surgery in orthopedics to 
avoid unnecessary mechanical and thermal trauma to healthy bone tissue. 
A computerized algorithm incorporating these guidelines can be imple- 
mented on an HP 9000 workstation connected to a central database for 
multiple-operating-room data collection, online consultation, and analysis. 

by Franco A. Canestri 



The work described in this article was done to confirm in
greater detail the conclusions published in 1983 (reference 1)
on treating polymethylmethacrylate (PMMA) with a nonmoving, CW,
10-watt CO2 laser beam and to investigate any possible
additional relationship among the ablated methacrylate volume,
the surface crater radius R(t_e), and its depth Z(t_e), where t_e
is the beam exposure time in seconds. Because of the very
close thermodynamic similarity between PMMA and bone
tissue (see Table I), these results may be valuable in
orthopedic surgery, where the procedures of cutting bone and
removal of bone cement (a methacrylate polymer) are
well-known sources of complications. Carbon dioxide lasers
have been used in continuous and pulsed modes in both
cases, but bone carbonization, thermal injury, and debris
result very frequently in inflammatory response with a
retarded rate of bone healing. Therefore, a method for clean
removal of bone cement and precise osteotomy without
mechanical and thermal trauma would have distinct
advantages over existing techniques.

In this article, equations for R(t_e) and Z(t_e) for each focal
length are presented. A very interesting relation was
identified between the ablated volume for a given focal length and
the values of R and Z integrated between t_e = 0 and t_e = 2
seconds. The most important result is confirmation of the
very close relationship between the areas under the R and Z
curves and the volume. With a simple equation (equation 3,
discussed later), it is possible to compare the characteristics
of craters obtained with moving and nonmoving laser beams
at different operative conditions between 0 and 2 seconds, a
time interval that covers the majority of combinations of
output powers, scanning speeds, and focal lengths reported
in the literature. The close thermodynamic similarities
between PMMA and compact bone tissue have been
demonstrated, except for the water content (Table I, bottom),
which strongly influences CO2 laser beam absorption.

Portions of this article were originally published in the International Journal of Clinical
Monitoring and Computing.[16] Copyright 1992 Kluwer Academic Publishers. Reprinted with
permission.



Therefore, a correction factor must be applied to the main
equation to calculate the ablated volume in bone tissue.



Table I 

PMMA versus Bone Thermodynamic Parameters 
in the Near to Mid-Infrared Wavelength Laser Beam Region 
(800 nm to 10.6 µm)

PMMA 



Density 



vera 
Specific Heal 
.1 



19 



1.38 |7] 



g °C 

Thermal Conductivity 0.17 |7| 
x 10" 

1.06 1 1 1 



\s • cm ■ C 
Thermal Diffushity 



9.6 [7| 



Fluence Ablation 
Threshold 

_J_ 
vcm" 

Latent Meat of Ablation 3.85 [7] 
cm 



Ablation Energy 

i x to 8 



Water Content (%) 



3.5 171 



0.3 |1»| 
immersed 
24h & 23 C 



Bone Tissue 

0.8 to 1.3 [5] 

1.3 to 23.1 [5] 

0.16 to 0.34 1 5] 
1.0 to 2.2 [51 



2.1 to 3.4 [14| 
8.0 to 18.0 [101 



3.7 to 13.0 111 | 
3.0 to 14.0 [3] 
10.0 1 14| 



The numbers in square brackets indicate references listed on page 12.

1.5    3    4.5    6    7.5    9   10.5
2.5    5    7.5   10   12.5   15   17.5
3.5    7   10.5   14   17.5   21   24.5
4.5    9   13.5   18   22.5   27   31.5

Fig. 1. Focal sequence matrix Ψ.

Equipment and Symbols 

As described in reference 1, our group at the National Cancer
Institute of Milan obtained results for focal lengths of
λ_1 = 2.5 in, λ_2 = 5 in, λ_3 = 7.5 in, and λ_6.3 = 400 mm = 15.75 in,
using a commercial Valfivre CO2 laser with a nominal output
of 10 watts on the beam spot. The transverse beam mode
was TEM11 and the focusing head was kept steady over
well-polished cubes of ester methacrylate (Vedril C from
Montedison) measuring 5 by 3 by 2 cm. The exposure
intervals of the nonmoving CW CO2 laser beam were set to 0.4,
0.7, 1, 1.3, 1.6, and 1.8 seconds. A nitrogen flow helped
remove powder and steam during irradiation. Knowing that
λ_2 = 2λ_1, λ_3 = 3λ_1, and λ_6.3 = 6.3λ_1, a working matrix Ψ (Fig.
1) was defined in which the elements of each row represent
focal lengths nλ_b, where n = 1, 2, 3, 4, ... and λ_b is the basic
focal length, varying between 1 inch and an arbitrary
maximum in steps of 0.5 inch (first column). The matrix Ψ
represents a comprehensive set of commonly used focal
lengths, structured to allow quick access to the operational
data on a given focal length. Each row of Ψ defines
the concept of a focal sequence FS_b of a given basic focal
length λ_b.
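
As an illustration of how the focal sequence matrix is organized, the following Python sketch builds rows of multiples nλ_b and looks up which focal sequences contain a given focal length. The function names and the choice of seven columns per row are assumptions made for this example; only the row structure comes from the text and Fig. 1.

```python
def focal_sequence(basic_focal_length_in: float, count: int = 7) -> list[float]:
    """Row FS_b of the matrix: focal lengths n * lambda_b for n = 1..count."""
    return [n * basic_focal_length_in for n in range(1, count + 1)]


def focal_matrix(basic_lengths_in: list[float], count: int = 7) -> dict[float, list[float]]:
    """Map each basic focal length lambda_b (inches) to its focal sequence."""
    return {b: focal_sequence(b, count) for b in basic_lengths_in}


def sequences_containing(matrix: dict[float, list[float]],
                         focal_length_in: float,
                         tol: float = 1e-9) -> list[float]:
    """Basic focal lengths whose sequence contains the given focal length."""
    return [b for b, row in matrix.items()
            if any(abs(x - focal_length_in) <= tol for x in row)]


if __name__ == "__main__":
    psi = focal_matrix([1.5, 2.5, 3.5, 4.5])
    for b, row in psi.items():
        print(b, row)
    # A 7.5-in focal length appears in more than one focal sequence.
    print(sequences_containing(psi, 7.5))
```

The lookup shows why a focal length chosen from one sequence can sometimes be checked against another: 7.5 in is both the fifth element of the 1.5-in sequence and the third element of the 2.5-in sequence.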

Results 

All of the existing experimental trials performed using a CW
CO2 laser beam with exposure times ranging between 0.4
and 2 seconds on PMMA samples show clearly the strong
focal-length-related ablative beam effects.[1,15-18] The data
points R(t_e) and Z(t_e) measured in this study can be
expressed for t_e between 0 and 2 seconds by the equations
shown in Fig. 2. The following empirical equation can
forecast the ablated volumes in PMMA for focal lengths of 2.5 in,
5 in, 7.5 in, and 15.75 in (400 mm):



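$$V(t_e, \lambda_k, FS_b) = L(\lambda_b, \lambda_k) \cdot C(t_e, \lambda_k) \cdot V_b(\lambda_b) \qquad (1)$$

with

$$C(t_e, \lambda_k) = \left[ \frac{\int_0^{t_e} Z(t)\,dt}{\int_0^{t_e} R(t)\,dt} \right]_{\lambda = \lambda_k}$$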



In this equation, V(t_e, λ_k, FS_b) is the ablated PMMA volume
after t_e seconds of λ_k-focused laser beam irradiation. V_b(λ_b)
is constant for each FS_b. λ_k is an integer multiple n of
λ_b = j + 0.5 in, where j = 0.5, 1, 2, 3, ....

Recent investigations have shown that equation 1 can be
written in a more analytical form for exposure times t_e of 0.4
and 2 seconds as follows:

For t_e = 0.4 s,

$$V(\lambda_k) = 0.1068 + 0.5581\,\lambda_k - 0.0296\,\lambda_k^2$$

For t_e = 2 s,

$$V(\lambda_k) = \exp\left[ \frac{-57.564 + 49.307\,\lambda_k}{1 + 14.741\,\lambda_k + 0.251\,\lambda_k^2} \right] \qquad (2)$$


Fig. 3 compares experimental data for these two exposure
times with plots of equations 2. For t_e = 0.4 s, r² = 0.996 and
for t_e = 2 s, r² = 0.987, where r² is a measure of how well a
given analytical curve fits the experimental data (r² = 1 for a
perfect match). These equations can also be used to study


i. = 2.5 in = 63 5 mm 



I = 35 - 



21.2 



,0 2 767 

Z = 12 - l" 4306 
AZ = ± 0 34 

2R = 0.258 + 0.642 t 0 , 0888 
2R = 0.9 
\2R = ± 02 



0.55 <I,S! 
0 < 1. < 0.55 



0< t e < 1 
1.2 1 



/. = 5 in = 127 mm 



Z = 31.5 - 



22.5 



,04475 
Z = 11 I 0 , 97 " 
AZ = ± 0.27 

2R = 0.516 + 0.784 |J' 5 ' 9 
2R = 1J 
A2R = ± 0.42 



la Si 

o< i, s i 

0.2 <l t sl 

1.2 1 



X = 7.5 in = 190.5 mm 

Z = 14.13 ln(1 + 1.) 0 < l e s 1.5 

i = 18 _ mu 

|J 'M' t, * 1.5 

\Z = ±0.3 

2R = 0.774 + 0 706 l°, ,4M 0 < l e < 1 
2R = 148 , e > 1 

\2R = ± 0 38 
X = 15.75 in = 400 mm 
Z = 2 5 l E 
\Z = ±0.11 

2R = 1.625 + 0.575 1° ;39 ' 
2H = 2 2 
\2R ■ ± 0.4 



0 < !«, < 2 

0< l e <l 
1.2 1 



Fig. 2. Experimental best-fit equations for R(t_e) and Z(t_e) for
nonmoving, 10-watt, CW laser beams at focal lengths of 2.5, 5, 7.5,
and 15.75 inches. R and Z are in mm. t_e is in seconds. The
transverse beam mode was TEM11.

Fig. 3. Best-fit curves of average ablated volumes in PMMA versus
focal length (inches) for a 10-watt, nonmoving, CW, TEM11 laser beam.

the effects of changing focal lengths, exposure times, and
ablated volumes. It is important to notice that there is a
maximum ablated volume for each exposure time t_e, and
that increasing the focal length does not correspond to a
linear increase of the ablated volume.

LCA Algorithm: Preliminary Investigation and Proposal 

Since PMMA and compact bone tissue have similar
thermodynamic characteristics except for their water content, the
proposed equations can be used in a closed-loop
computer-assisted algorithm for orthopedic surgical applications. The
algorithm is named LCA after the two parameters L and C in
equation 1.

The implementation of this algorithm on an HP 9000 HP-UX
workstation would provide the surgeon with an additional
safety tool to reduce the risks of bone injury during laser
irradiation, which often results in inflammatory response
with a retarded rate of bone healing. This happens
quite frequently during general orthopedic surgery, especially
because of incorrect settings of laser beam focal lengths
and/or exposure times. Removal of bone cement (a
PMMA-based polymer) that is in close contact with healthy native
bone is the most critical operation in terms of potential bone
damage.

Operation of the LCA algorithm is as follows (see Fig. 4).
The surgeon specifies the required crater diameter 2R and
depth Z and the maximum tolerances Δ2R and ΔZ, and
chooses parameters λ_k, t_e, W, v that are likely to produce the
desired ablation. (W is the output power of the laser and v is
the scanning speed of the laser beam.) The computer
program checks whether λ_k in FS_b is also included in FS_2.5. The
focal sequence FS_2.5 is known experimentally and is
therefore always used as the primary reference.

In parallel, the maximum ablation volume V_max is calculated
and stored as described in reference 15, using the specified
values of R and Z. The values of V_b, L, and C are calculated
using equation 1. If λ_k is not an element of FS_2.5, the
algorithm interpolates between the two closest focal lengths
belonging to FS_2.5.
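
The interpolation step can be sketched as follows. The article does not state which interpolation scheme is used, so this Python example assumes simple linear interpolation between the two nearest members of FS_2.5; the function name and the sample values in the usage note are hypothetical placeholders, not measured data.

```python
from bisect import bisect_left


def interpolate_in_sequence(focal_length_in: float,
                            known: dict[float, float]) -> float:
    """Linearly interpolate a quantity at `focal_length_in`.

    `known` maps focal lengths (inches) belonging to the reference focal
    sequence to a quantity known at those focal lengths. Linear
    interpolation is an assumption of this sketch.
    """
    if focal_length_in in known:
        return known[focal_length_in]
    xs = sorted(known)
    if not xs[0] < focal_length_in < xs[-1]:
        raise ValueError("focal length lies outside the reference sequence")
    i = bisect_left(xs, focal_length_in)  # index of first member above the target
    x0, x1 = xs[i - 1], xs[i]
    weight = (focal_length_in - x0) / (x1 - x0)
    return known[x0] + weight * (known[x1] - known[x0])
```

For example, with placeholder values `{2.5: 1.0, 5.0: 2.0, 7.5: 4.0}`, a 6-in focal length falls between the 5-in and 7.5-in members of the sequence and the function returns a value 40% of the way from 2.0 to 4.0.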

Fig. 4. Flowchart of the LCA algorithm.

Equation 1 has to be corrected for a laser beam that is
moving with respect to the operating table and to take into
consideration the different CO2 laser beam absorption
modalities of PMMA and bone tissue because of their different
water content. A nonmoving laser beam has the same
cutting capabilities as a moving beam if the former has an
equivalent exposure time (t_equiv) given by the equation:[2]



W W,,, v 2v ' 
where R* is the surface radius of the beam spot, v is the
scanning speed of the laser beam, W is the output power of
the moving laser, and W_nm is the output power of the
nonmoving laser.

In the case of a moving laser beam (v ≠ 0), equation 3 is used
to determine t_equiv. The crater diameter 2R and depth Z are
then calculated using the equations in Fig. 2. Their values
are compared with the specified values using the tolerances
Δ2R and ΔZ supplied by the surgeon as input data. The two
calculated values are corrected by adjusting the exposure
time t_e until they are within the specified tolerances. Finally,
the data 2R, Z, t_e, and V(t_e, λ_k, FS_b) are proposed for
validation after an additional safety check between the volume
V_max and V(t_e, λ_k, FS_b). This last step is necessary to prevent
the ablation volume from exceeding the value V_max
calculated at the beginning, which is a "not-to-exceed" ablation
volume. This can happen if the wrong λ_k and t_e are selected
at the beginning of the LCA simulation.
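
The adjustment of t_e and the final volume check can be illustrated with a small Python sketch. It is not the LCA implementation: the bisection search is an assumption about how the exposure time might be adjusted, only the crater depth is matched here, and the depth model is the 7.5-in relation Z = 14.13 ln(1 + t_e) as read from Fig. 2 for 0 < t_e ≤ 1.5 s. All function names are hypothetical.

```python
import math


def depth_mm_7_5in(t_e: float) -> float:
    """Crater depth Z(t_e) in mm for the 7.5-in focal length, as read
    from Fig. 2; taken here to be valid for 0 < t_e <= 1.5 s."""
    return 14.13 * math.log(1.0 + t_e)


def exposure_for_depth(target_mm: float, tol_mm: float,
                       model=depth_mm_7_5in,
                       t_lo: float = 0.0, t_hi: float = 1.5,
                       max_iter: int = 60) -> float:
    """Adjust t_e by bisection until |model(t_e) - target| <= tol.

    Assumes `model` increases monotonically with t_e on [t_lo, t_hi].
    """
    if not model(t_lo) - tol_mm <= target_mm <= model(t_hi) + tol_mm:
        raise ValueError("target depth not reachable in the valid exposure range")
    for _ in range(max_iter):
        t_mid = 0.5 * (t_lo + t_hi)
        z = model(t_mid)
        if abs(z - target_mm) <= tol_mm:
            return t_mid
        if z < target_mm:
            t_lo = t_mid
        else:
            t_hi = t_mid
    raise RuntimeError("no exposure time found within the iteration limit")


def within_volume_limit(v_predicted: float, v_max: float) -> bool:
    """Final safety check: the predicted ablated volume must not exceed V_max."""
    return v_predicted <= v_max
```

A caller would first obtain `t_e = exposure_for_depth(8.0, 0.3)` for an 8-mm crater with ΔZ = 0.3 mm, then compute the predicted volume for that t_e and reject the setting if `within_volume_limit` returns `False`.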

In case of a dangerous situation, a warning message appears
and a new focal length is suggested even if it belongs to a
different FS_b in Ψ. At the end of the simulation, a
comprehensive final report is printed out for the surgeon's
convenience. In parallel, a central database is automatically
updated for later review. Reports and statistics can be
requested either online for direct support in a specific case
that needs more attention or later for teaching and research
activities. Video images stored during the actual operation
can also be recalled, printed, and attached to the report for
the complete documentation of each case.

For t_e = 0.4 s and t_e = 2 s, equation 2 is used instead of
equation 1. This allows a faster determination of the final
total ablated volume for a given λ_k.



System Design 

By implementing a workstation-based design, each operating
room can be equipped with a CO2 laser mainframe which
can be interfaced to an HP-UX workstation able to perform
several tasks simultaneously in real time (Fig. 5). For
example, one task is the general supervision of the laser beam
following the guidelines proposed, analyzed, and validated
through the LCA algorithm. This can be achieved by using a
laser control interface for dynamic adjustment of the laser's
output parameters and by a multiplexer which physically
checks that the laser performs as requested. This is done by
using an optical device connected to the laser output
focusing head, which is also responsible for changing the laser's
focal length and the beam mode.

A second important task is network communication among
several similarly equipped operating rooms. Each
independent node can send LCA simulations, intraoperative data,
video sequences, and other results directly to a main
database over a multiple-user local area network. The network
also allows mutual point-to-point communication so that
operating room X can exchange data with operating room Y
for consultation. The database and the HP-UX operating
system are resident on a network server. The LCA
application software together with the related check routines is
loaded on each operating room's workstation, which is
physically installed in a reserved area close to the operating
room but not in the patient's vicinity.

This method can increase the productivity of the operating 
room suite of a hospital. It also offers the possibility of build- 
ing a reference center for laser applications in surgery, using 
a network concept that can be extended to other institutions. 
Conclusions

The LCA method suggests a global but detailed set of
guidelines to be followed during orthopedic surgery using a
continuous wave CO2 laser beam at different operating conditions.
Critical cases can be simulated on PMMA samples first and
then transferred to bone tissue. It has also been shown how
to transfer preliminary test results from PMMA to bone
samples for moving or nonmoving CW laser beams. A
computerized system can store and control in real time the operative
procedures and a convenient database can be built for later
consultation. Additional investigation is needed to test the
validity of this method over a large variety of hard tissues
and during the use of pulsed and superpulsed laser beams.
Online Defect Management via a
Client/Server Relational Database 
Management System 

The ability to provide timely access to large volumes of data, ensure data 
and process integrity, and share defect data among related projects are 
the main features provided in this new defect management system. 

by Brian E. Hoffmann, David A. Keefer, and Douglas K. Howell



The defect management system, or DMS, described in this 
article is an online transaction processing system for
managing defects found during software and firmware
development and test. It was developed to enable HP's Boise Printer
and Network Printer Divisions to manage shared defects in 
leveraged and concurrent products and to increase data 
integrity and reduce overall defect processing time. The 
DMS application is based on an off-the-shelf relational data- 
base management system, which employs a client-server 
architecture running on an HP 9000 workstation. The devel- 
opment team employed an evolutionary delivery process to 
ensure that the system met user needs and used proprietary 
4GL (fourth-generation language) programming tools to
maximize productivity. This paper summarizes the rationale 
for building DMS, details its implementation and design, and 
evaluates the system and its development process. 

Background 

Since the introduction of the first HP LaserJet printer in 1984,
increasing customer demand for LaserJet products has kept
HP's printer divisions on a steady growth curve for years.
Market demand for new products with increased capability 
has continually challenged the R&D and quality assurance 
organizations to scale up their activities, while improving 
overall product quality and reliability. Furthermore, competitive
pressures for increased frequency of product introductions
with shorter development times have challenged development
teams to drive out process inefficiencies so they can
develop more complex products in less time.

One of the responsibilities of the software quality organiza- 
tion is to provide extensive defect tracking and software 
process measurement services which enable R&D manage- 
ment to gauge software quality and product schedule accu- 
racy. This also entails maintaining all historical defect data 
on LaserJet and related products, which provides management
information about historical product quality and past
and present project schedule trends. 

As R&D activities continued to expand, we found that our
ability to support the existing defect tracking system became
limited. Foreseeing an inability to manage defect information
and software metrics at this scale with existing tools,
we set out to develop a defect tracking system that could
operate under these demands as well as tackle some of the
more difficult defect tracking challenges.

Existing Problems 

Many features required for the divisions' defect tracking
process were not supported by the old defect tracking software.
Over time our process had evolved into a largely
manual system with limited electronic assistance. Fig. 1
gives an overview of the key elements of our old defect
tracking system.

Three physical elements were required for defect submittal: 
a paper submit form, defective hardcopy (if applicable), and
source files (if applicable). Since the existing (pre-DMS) 
defect tracking software was unable to translate some of 
these elements into an acceptable electronic form, manual 
translation and filing processes became necessary. This mix- 
ture of human and electronic processes created problems in 
the following areas: 

• Volume sensitivity 

• Tracking defects through concurrent projects and code 
leverages 

• Data and process integrity 

• Timeliness. 

Volume Sensitivity. Among all the problems with the previous 
defect tracking system, volume sensitivity was the most
notable. Because of the serial nature of the old process and 
its requirement for extensive human assistance to move 
defects through the system, bottlenecks would occur under 
any serious load. Many steps required manual intervention 
by engineers and administrative assistants to drive a defect 
through its complete cycle. As a result, the labor demands 
imposed by the defect tracking system became a tremen- 
dous burden as the number of defects submitted by projects 
increased. 

Concurrent Projects and Code Leverages. Nearly all the R&D
projects that tracked defects were code leverage efforts
rather than new code development efforts. In addition, many
leverages of similar code were occurring simultaneously
among projects at multiple sites. However, no utility or process
in the defect tracking system dealt directly with the
problem of tracking defects in leveraged code. The problem
was particularly noticeable at the beginning of a leveraged
project when R&D engineers were required to read entire
databases of defects to identify unresolved problems in inherited
code. In addition, no formal notification mechanism
existed that would notify R&D engineers, for example, that
the code they inherited last week received a new defect
today.

Data and Process Integrity. An unfortunate and costly by- 
product of a system without data and process integrity 
checks is data corruption and unknown data states. Our old 
defect tracking system was no exception to this rule. Given 
the relatively open flat-file data structures and often unreli- 
able e-mail-based transaction schemes of this system, we 
often scrambled to recover or reconstruct a lost or broken
defect record, an activity that often consumed all the time
of the defect tracking system administrator. 

Timeliness. A final weakness of the old defect tracking system
was its inability to provide timely access to accurate defect
information and project metrics. Since portions of the process
were distributed among various people and tools, instantaneous
information was not always available. Even simple
requests for defect information might require assistance from
the defect tracking administrator or specialized tools. This
serial process and its patchwork of components effectively
inhibited the free flow of defect information to R&D.

DMS Features 

Implementation Guidelines 

To maintain focus during the implementation of DMS, the
following guidelines were established to assess whether
DMS would achieve its design objectives:
• DMS must seamlessly and automatically encapsulate our
old defect process model, which has proven itself in the past.
• DMS must rely on a client-server architecture to deliver its
capabilities via a network to as much of the development
community as possible, while centrally maintaining and
ensuring 24-hour, seven-day continuous operation.
• Given the rate at which R&D and quality assurance processes
are adapting to keep pace with market demands,
DMS must be able to adapt and embrace additional process
refinements as they evolve.

Fig. 1. The original (pre-DMS) defect tracking system.

DMS Process Encapsulation 

On the surface, DMS is an online database application that
offers engineers and managers full electronic access to defect
information. However, DMS is much more than just a data
collection and reporting tool. It is also an electronic mechanism
that supports the defect tracking process that we have proven
and refined over time.

DMS functionality is divided into six core and six auxiliary
functions (see Fig. 2). The core functions were identified as
the minimal set required for an operational system. Auxiliary
functions were added incrementally in subsequent releases
of the tool. The core functions are Submit, Receive, Resolve,
Modify/Delete, Update, and Verify. The auxiliary functions are
Screen, Screen Resolve, Screen Update, Unreceive, Unresolve, and
Unverify. With the exception of Update, each of these functions
causes a defect to move from one state to another. The
states, which are represented by rectangles in Fig. 2, are
Unscreened, Rejected Unscreened, Unreceived, Open, Unscreened Resolve,
Unverified Resolve, and Verified Resolve. Each state represents the
status of a defect record in the DMS database.

DMS functions are accessed by the user from the menu
items presented by the initial DMS screen (see Fig. 3). The
Submit, Modify/Delete, and Receive functions are accessed
through the Submit menu item, the Resolve and Screen Resolve
functions are accessed through the Resolve menu item, and
the Verify function is accessed through the Verify menu item.
The auxiliary functions are accessed through the Update
menu item. Users navigate DMS forms either through the

Three roles are played by DMS users, and each role has a
different permission level. The roles are user (lowest permission
level), screener, and manager (highest permission
level). Users typically perform the Submit, Receive, Resolve, and
Verify functions. Screeners typically perform the Screen and
Screen Resolve functions. The manager permission level is
reserved for individuals who are responsible for adding and
configuring projects in DMS.

Submit Function. This is where defect information is initially
entered into the DMS database, resulting in the defect being
placed in the Unscreened state (a in Fig. 2). The user is required
to enter a minimal set of information relating to the
defect. The user also has the opportunity to add optional
information at this time. Submitters have the ability to attach
both text and object files to defects. These files may be
of particular use to engineers attempting to reproduce and
repair defects. Once a defect is submitted, the user can
continue to add information via the Modify/Delete
function (b) until the defect is screened.

Screen Function. This function is typically performed by a
person associated with the development team who has intimate
knowledge of the product or test process. The Screen
function is the point at which the defect information is examined
for completeness and correctness, and the defect
severity is added to the defect record. If insufficient information
is provided by the submitter then the screener may
reject the defect, placing it in the Rejected Unscreened state (c).
The act of rejecting a defect causes electronic mail to be sent
to notify the submitter that the defect was not accepted. The
submitter then has the option of deleting the defect or modifying
and returning the defect to the Unscreened state via the
Modify/Delete function (b). It is important to note that this is
the only point in the DMS process where a defect can be
removed from the database. Once a defect has been screened
it is assigned an identifying number and made part of the
permanent project database upon its entry into the Unreceived
state (d).

Fig. 4. One of four pages of information presented to the user
performing a Screen Resolve function.

Receive Function. A defect is routed to the person who will
most likely be responsible for defect repair via the Receive
function. The name of the responsible engineer is obtained
from a database list of all possible names for a given project.
When a defect is received, the defect is moved into the Open
state and an electronic mail notice containing the defect
number and brief details about the defect is sent to the
person named as the responsible engineer (e).

Resolve Function. When a defect is repaired, the Resolve func- 
tion is used to add fix information to the defect record and 
promote the defect to the Unscreened Resolve state (f). This
function is usually performed by the engineer who enters 
the resolution code, the description of the resolution, and 
any relevant files. Depending on the resolution code, other 
information may be required as well. For example, if the 
defect is resolved as a CC (code change) then the user is 
required to add the name of at least one module that was 
changed. 

Screen Resolve Function. This function allows the project 
screener to scan the resolved defect to make sure that the 
resolution information is as complete and correct as pos- 
sible. No additional information is added to the defect rec- 
ord by this function. Screen Resolve allows each project 
screener to make sure that the resolution information meets 
the standards set by each project team. If the resolution is
rejected by the screener, the defect is returned to the Open
state, and electronic mail is sent to the responsible engineer
stating that the resolution has been rejected (g). If the resolution
is accepted by the screener, the defect is promoted to
the Unverified Resolve state and the submitter is sent electronic
mail stating that the defect is ready for verification (h).

Fig. 4 shows one of four pages of information presented to
the user performing the Screen Resolve function. The user can
switch between pages with the View menu item. The File
menu item is used to move files between the HP-UX* file
system and the defect record. The Update menu item allows
users access to the Update function. The bottom two lines of
the form show that the defect is shared between two projects.
The defect is in an Unscreened Resolve state for project
Training_1 (pr = Unscreened Resolve) and in an Unverified Resolve
(r = resolve) state for project Training_2.

Verify Function. The submitter uses the Verify function to determine
if the defect is fixed. If the resolution is acceptable to
the submitter, a verification code is added to the defect and
it is promoted to the Verified Resolve state (i in Fig. 2). If the
submitter decides that the defect is not repaired then the
project screener is notified. The screener has the capability
to return the defect to the Open state via the Unresolve function.

Undo Functions. Screeners also have the ability to move
defects from the Verified Resolve state to the Unverified Resolve
state via the Unverify function and from the Open state to the
Unreceived state via the Unreceive function. When a defect is
moved to a previous state all information that was added by
the previous function is lost. For example, when a defect is
unresolved the information added by the Resolve function is
lost.

Update Function. When a defect is in the Unreceived state and
beyond, changes are made to the defect via the Update function.
Changes made to defects by this function are either
applied directly to the defect record or placed in an update
queue based on the following criteria:

• If the person performing the update is a screener or is listed
as a responsible engineer for the project that owns the defect
then the update is applied to the defect record.

• If the person performing the update does not meet the above
criteria then the modified defect record is placed in the update
queue where a screener must approve the modifications
before they are applied to the defect database.

There is also a set of configurable rules that may force an
update into the update queue. If the screener rejects the
update, electronic mail is sent to the originator of the update
about the rejection (j in Fig. 2).
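
The routing criteria for updates can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Project` class, the `route_update` function, and the `forced_by_rule` flag are hypothetical names that do not come from DMS, and the configurable rules are reduced to a single boolean.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical project record listing trusted people by name."""
    screeners: set = field(default_factory=set)
    responsible_engineers: set = field(default_factory=set)


def route_update(person: str, project: Project, forced_by_rule: bool = False) -> str:
    """Return 'apply' if the update goes straight to the defect record,
    or 'queue' if a screener must approve it first."""
    trusted = (person in project.screeners
               or person in project.responsible_engineers)
    # A configurable rule (modeled here as one flag) can force even a
    # trusted person's update into the queue.
    if trusted and not forced_by_rule:
        return "apply"
    return "queue"


project = Project(screeners={"ann"}, responsible_engineers={"bob"})
print(route_update("ann", project))   # screener: applied directly
print(route_update("cat", project))   # neither role: queued for approval
```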

There is an additional process step not shown in Fig. 2. It was
pointed out to the DMS developers that in the early stages of
a project, engineers frequently find and fix a great many
defects in a very short period of time. The engineers found it
very time-consuming to submit a defect and wait for another
individual (perhaps two) to screen and receive a defect so
that it could be resolved. In this case the DMS process was
seen as a deterrent to collecting complete defect history. The
process model was modified in the third release of DMS to
allow the responsible engineer to move a defect from the
Unscreened state to the Unscreened Resolve state via the Resolve
Unscreened function. This function can be performed only if
the submitter and resolver are the same person, and the
resolver is the responsible engineer for the project that
owns the defect.

Users can readily determine the distribution of defects for a 
project through the project snapshot screen (see Fig. 5). 
Accessed from the Report menu item shown in Fig. 3, the 
snapshot shows the number of defects in each DMS state 
and the bugweight. The bugweight metric is calculated as 
the sum of severities squared for all defects in the Open and 
Unreceived states. 
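
As a small worked example of the bugweight calculation, the sketch below assumes that severities are stored as positive integers and that each defect is represented as a (state, severity) pair. Both the representation and the sample values are assumptions made for illustration, not details taken from DMS.

```python
def bugweight(defects):
    """Sum of severity squared over defects in the Open or Unreceived state.

    `defects` is an iterable of (state, severity) pairs.
    """
    counted_states = {"Open", "Unreceived"}
    return sum(severity ** 2
               for state, severity in defects
               if state in counted_states)


# Hypothetical project snapshot: only the first, second, and fourth
# defects count toward the metric.
sample = [("Open", 3), ("Unreceived", 2), ("Verified Resolve", 4), ("Open", 1)]
print(bugweight(sample))  # 3*3 + 2*2 + 1*1 = 14
```

Squaring the severity means that a single severe defect outweighs several minor ones in the resulting figure.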
Fig. 5. Project snapshot screen.

DMS fully encapsulates this six-step process and subjects all
incoming defect information to rigorous process checks.
Users can rely on the knowledge that defects are in predictable
states and benefit from the valuable process metrics
that are derived by measuring transitions from state to state.
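
The state transitions described in this section can be summarized as a table-driven state machine. The following Python sketch is illustrative only: the function labels with "(accept)" and "(reject)" suffixes are invented to distinguish the two outcomes of a screening step, and the source state of Unresolve (Unverified Resolve) is an assumption, since the text names only its destination.

```python
# (current state, function) -> next state, following the process description.
TRANSITIONS = {
    ("Unscreened", "Screen (accept)"): "Unreceived",
    ("Unscreened", "Screen (reject)"): "Rejected Unscreened",
    ("Unscreened", "Resolve Unscreened"): "Unscreened Resolve",
    ("Rejected Unscreened", "Modify/Delete"): "Unscreened",
    ("Unreceived", "Receive"): "Open",
    ("Open", "Resolve"): "Unscreened Resolve",
    ("Open", "Unreceive"): "Unreceived",
    ("Unscreened Resolve", "Screen Resolve (accept)"): "Unverified Resolve",
    ("Unscreened Resolve", "Screen Resolve (reject)"): "Open",
    ("Unverified Resolve", "Verify"): "Verified Resolve",
    ("Unverified Resolve", "Unresolve"): "Open",  # source state assumed
    ("Verified Resolve", "Unverify"): "Unverified Resolve",
}

INITIAL_STATE = "Unscreened"  # state entered by the Submit function


def next_state(state, function):
    """Return the state reached by applying `function`, or raise ValueError
    if the process does not allow that function in that state."""
    try:
        return TRANSITIONS[(state, function)]
    except KeyError:
        raise ValueError(f"{function!r} is not allowed in state {state!r}")


# Walk one defect through the normal path.
state = INITIAL_STATE
for function in ["Screen (accept)", "Receive", "Resolve",
                 "Screen Resolve (accept)", "Verify"]:
    state = next_state(state, function)
print(state)  # Verified Resolve
```

Keeping the transitions in one table makes illegal moves (for example, Verify on an Open defect) fail in a single place, which mirrors the idea that defects are always in predictable states.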

Client/Server Architecture

Another DMS characteristic that provides the foundation for 
many DMS services is the fact that it is built on a commer- 
cially available client/server RDBMS (relational database 
management system). Among the many benefits of a client/ 
server database architecture are distributed processing, 
heterogeneous operating environments, and elimination of 
many configuration management problems. An additional 
database feature that enables DMS to guarantee data and
process integrity is transaction management. Since DMS
uses the OLTP (online transaction processing) capabilities
of the underlying database, all operations that modify data
in DMS are nested in real-time transactions. Modification
requests that fail as a result of invalid data or hardware failure,
for example, do not corrupt existing data. Bad transactions
are automatically rolled back to known previous
states. See "Client/Server Database Architecture" on page 78.
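
The rollback behavior can be demonstrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in, since the commercial RDBMS behind DMS is not named here, and the `defect` table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE defect (id INTEGER PRIMARY KEY, state TEXT NOT NULL)")
conn.execute("INSERT INTO defect (id, state) VALUES (1, 'Open')")
conn.commit()


def resolve_with_failure(conn):
    """Attempt a two-step modification whose second step is invalid."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE defect SET state = 'Unscreened Resolve' WHERE id = 1")
            # Invalid data: violates the NOT NULL constraint on state.
            conn.execute("INSERT INTO defect (id, state) VALUES (2, NULL)")
    except sqlite3.IntegrityError:
        return False
    return True


def current_state(conn, defect_id):
    row = conn.execute(
        "SELECT state FROM defect WHERE id = ?", (defect_id,)).fetchone()
    return row[0]


print(resolve_with_failure(conn))  # False: the transaction failed
print(current_state(conn, 1))      # Open: the earlier UPDATE was rolled back
```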

Defect Sharing. One of DMS's most important implementa- 
tion details is the relational structure of a defect record. By 
making use of traditional relational design guidelines, defect 
record implementation in DMS enables information for one 
project to be shared easily with a record from another project.
As Fig. 6 indicates, the pre-DMS implementation of a
defect record consisted of project-dependent submit and 
resolve information, which could not be easily shared 
between projects. 

In DMS, relational structuring has been used to separate
submit information from project-specific resolve data. One
submit record can be shared among many different projects,
with each project having a potentially unique resolution record.
This model is shown in Fig. 7. The major benefits of
this structure are twofold. First, all projects charged with
the same defect automatically share project-independent
submit information. Second, the structure provides an automatic
communication path for project-dependent resolve
information to all the projects that share the defect. Thus,
each project can quickly obtain another project's status
information for a shared defect.
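
A minimal relational sketch of this structure follows, again using sqlite3 as a stand-in database. The table names, column names, and the defect description are hypothetical. Only the idea of one shared submit record with one resolve record per project comes from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE submit (                 -- project-independent information
    defect_id   INTEGER PRIMARY KEY,
    description TEXT NOT NULL
);
CREATE TABLE resolve (                -- one row per project sharing the defect
    defect_id INTEGER NOT NULL REFERENCES submit(defect_id),
    project   TEXT NOT NULL,
    state     TEXT NOT NULL,
    PRIMARY KEY (defect_id, project)
);
INSERT INTO submit VALUES (1, 'hypothetical defect description');
INSERT INTO resolve VALUES (1, 'Training_1', 'Unscreened Resolve');
INSERT INTO resolve VALUES (1, 'Training_2', 'Unverified Resolve');
""")


def status_by_project(conn, defect_id):
    """Return {project: state} for every project that shares the defect."""
    rows = conn.execute(
        "SELECT project, state FROM resolve WHERE defect_id = ? ORDER BY project",
        (defect_id,))
    return dict(rows.fetchall())


print(status_by_project(conn, 1))
```

Because the submit row is stored once, a correction to the description is seen by every project, while each project's resolution state stays independent.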



Fig. 6. A pre-DMS defect record consisted of project-dependent
submit and resolve information tied together in one record structure.
Sharing defect information between projects was difficult.

Flexible Architecture. DMS has been able to derive several 
benefits from its relational implementation. Of these bene- 
fits, structural extensibility is perhaps the most important. 
Given that the system was designed and implemented in an 
evolutionary delivery model, a relational architecture 
proved to be an ideal complement. Existing structures can 
be easily reused and new structures can be added without
massive structural rewrites. For example, elaborate user
configurability was not supported in early releases. For sub- 
sequent releases, however, it was a simple matter to add 
user configuration tables to the schema without altering 
existing defect record structures. 

Connectivity. DMS's client/server architecture has also enhanced
various aspects of connectivity. The physical separation
of data manipulation code (back end) from user interface
code (front end) maximizes modularity, resulting in more 
readable, less error-prone code. As a side benefit of the sep- 
aration, incremental functionality can be added to the front 
or back end of the code while online without affecting either 
end. From a maintenance perspective, the client/server ar- 
chitecture eliminates many traditional configuration man- 
agement headaches. Software distribution problems, for 
example, are eliminated since client interface code is ex- 
ported to users on the network from one central location via 
NFS. From a network perspective, a heterogeneous client 
environment is fully supportable. Since the server requires 
no knowledge of client type, multiple client platforms have 
equal access to DMS. 

Robust Operation. Transaction management capabilities 
round out the list of major DMS features. Given that DMS
was a migration to a multiuser online transaction processing
system, full transaction arbitration became a must. During 

(continued on page 80)

Client/Server Database Architecture



Database management systems implemented under client/server computing
models have enjoyed increasing popularity in recent years. Advanced networking
capabilities coupled with powerful minicomputer and microcomputer systems
connected to networks have favored client/server database architectures over
more traditional centralized database management systems. While the specific
benefits of implementations vary, client/server databases typically distinguish
themselves in four general categories:

• Separation of presentation services from data manipulation services

• Scalable high performance 

• Server-enforced integrity and security 

• Heterogeneity and distribution autonomy 

Separation of Presentation and Data Manipulation Services 

Unlike traditional centralized databases, client/server database environments cleanly
separate presentation (user interface) services from data manipulation services. A
database client performs all application or user-specific services necessary to
convey information to and from a user. The data server focuses all services on the
efficient and secure manipulation of data conveyed to it from the client. Fig. 1
illustrates the labor division in traditional and client/server architectures.

The net result of this separation is an optimum division of labor: data management
and transaction functions are managed independently from user interface and
presentation functions. The benefits of this approach are twofold. First, distribution of
client services to each client CPU enables the server to maintain a respectable
response time performance advantage over traditional databases. Fig. 2 shows
typical response curves for traditional versus client/server databases. Since client/
server databases are less demanding of operating system overhead, they tend to
perform better under load.

The second major benefit of the client/server separation is the leverage of existing
CPUs on the network. Client/server architectures stretch an organization's
overall CPU investment by ensuring that overall processing loads are properly
balanced between client and server processing.

Fig. 1. Division of processing labor in traditional versus client/server architectures.
(a) Traditional architecture. (b) Client/server architecture.

Fig. 2. Throughput and response time performance in traditional and client/server architectures.

Scalable High Performance 

A unique performance trait of most client/server database architectures is their
ability to enhance server performance much more than traditional database
architectures. As seen in Fig. 2, response time as a function of user load tends to scale
more linearly under client/server architectures. This performance advantage is
often rooted in the following design fundamentals:

• Use of stored database procedures, which often include control flow extensions to
the data manipulation language

• Use of remote procedure calls (RPCs) for server-to-server communication

• Implementation of the server as a single multithreaded process in the operating
system

Stored procedures are the hallmark of client/server databases. Typically, they exist
as specialized database executables. These executables are constructed by
compiling source statements from the same data manipulation statements (e.g., SQL)
used with the server in an interactive fashion. Once a procedure is compiled and
stored in the database's data dictionary, an application can issue a run-time call to
the procedure. The procedure then executes the same data manipulation or query
that was defined by the source statements that built the procedure.

Stored procedures offer many major performance benefits. First, network
communications are dramatically reduced since one procedure call replaces many
individual data manipulation statements. Second, since stored procedures are already
compiled at run time, performance measurements indicate that they can process
data manipulation statements five to ten times faster than a sequence of single
data manipulation statements. Fig. 3 illustrates the execution differences between
stored procedures and traditional database servers.

Third, stored procedures often possess the ability to control logic flow in database
operations. Control constructs such as branching and looping combined with the
ability to declare local variables and create temporary database objects, such as
tables, enable stored procedures to perform complex data manipulation sequences.
Stored procedures can also be nested to invoke a series of database events with a
single function call.

Another feature high-performance servers possess is the implementation of the
server as a single operating system process while accommodating multiple
simultaneous client processes. Fig. 4 illustrates the architectural differences between
single-threaded and multithreaded server implementations.

A multithreaded server design frees the database from nearly all of the operating
system overhead that limits traditional database architectures. For example, the
amount of memory required for a database user connection in a multithreaded
implementation is around 50K bytes. In contrast, traditional single-threaded
servers can require up to 2M bytes of memory per user connection, operating system
overhead included. Hence, multithreaded server implementations make more
memory available for disk caching and other applications.

Fig. 3. Execution profiles in traditional and client/server architectures.

Server-Enforced Integrity and Security 

The client/server approach also has many advantages in preserving the integrity
of information in a database. Unlike traditional approaches to maintaining data
and process integrity, business rules and data transaction checks in a client/server
database environment are exclusively enforced by the server. Opportunities for
data corruption resulting from maintenance efforts are significantly reduced, since
business rules and transactions only have to be modified in the server instead of
in every client application using them.

A specific mechanism often employed by a client/server database for enforcing
integrity constraints is known as the trigger. A trigger is a special type of stored
procedure that is attached to a table and automatically called, or triggered, by an
attempt to insert, delete, or update data in a table. Since triggers reside in the
server with the database, they are particularly effective as integrity mechanisms
since they adopt a data- and business-rule-driven approach to integrity, as opposed
to an application-controlled integrity approach. Trigger code is written only once,
instead of many times in multiple client applications. An application cannot avoid
firing a trigger when it attempts to modify data in a table.

Another common use of triggers is for the maintenance of internal database
consistency, or referential integrity. For example, duplicate data rows in related tables
can be prevented by a uniqueness constraint defined in an insert trigger of either
or both tables to guarantee the one-to-one unique relationship that exists
between two tables. Since client applications cannot be relied on to maintain the
consistency of a database, triggers prove to be the ideal mechanism for this task.

Some integrity mechanisms seen in client/server database environments impose
data constraints on single data fields directly. These mechanisms include rules,
defaults, and user-defined data types. A rule is a programmable mechanism for
performing conditional data checks such as data range checks or conditional
comparisons as well as structural checks on data syntax. Defaults simply provide
a user-specified value on inserts in the event that one is not provided with the
insert statement. User-defined data types provide integrity on values that are at
higher levels of abstraction than numerical types alone provide. Some higher level
user-defined data types might include money, color, or postal code.
Heterogeneity and Distribution Autonomy

Client/server databases deliver an open architecture that facilitates portability to
and management of the multivendor components within a networked computing
environment. Hardware independence is much more easily achieved in client/
server situations since the architecture can cleanly divide client hardware from
server hardware. Furthermore, most commercial vendors of client/server database
environments offer mechanisms for easily linking different operating platforms
together. Thus, existing and new hardware resources have a much higher
compatibility likelihood and can therefore be used more efficiently. Additionally, hardware
budgets can be spread further and take full advantage of downsizing to
workstations and supermicrocomputer systems.

In addition to supporting hardware independence in networked database
environments, client/server interfaces also permit open communication in heterogeneous
software environments. The same formalized software interfaces that connect a
client to a server can be leveraged by other applications or software
environments. Foreign data sources and applications can be seamlessly integrated into a
client/server database environment.

Although the client/server approach breaks down the traditional barriers that
prevent data distribution, it simultaneously creates potential for excessive
complexity in distributed database processing. As the risk of data corruption increases
with the complexity of distribution schemes, the old "centralize versus decentralize"
debate becomes justifiably fueled. Fortunately, implementation under a client/
server approach does not require an all-or-nothing approach to distribution of the
database system. Developers are free to choose the level to which the database
system is to be distributed, thus retaining a high degree of autonomy on the issue
of distribution. Furthermore, the client/server approach allows developers to
evolve the system incrementally toward more or less distribution as required by
the application. In traditional database architectures, the choice is fixed, with
evolution to more distributed and heterogeneous environments made virtually
impossible.

heavy operation, the server reliably conducts more than 50
simultaneous client sessions. A beneficial side effect of 
transaction management for DMS is that the integrity of all 
process steps can be maintained. Any process change that 
fails will simply be rolled back. Transaction management 
also provides the capability to assign user permissions to 
process transactions. With transaction rules and user roles 
fully defined in DMS, all process rules can be enforced 
through user permissions. 

Physical Overview 

DMS is implemented in two distinct components: a data 
server and a user interface client (see Fig. 8). In addition,
mail clients and report clients exist that can also interact 
with the server. The data server is a commercially available 
relational database that contains the procedures, triggers, 
rules, and keys that control data manipulation. These ele- 
ments are stored in the server as part of the DMS build 
process. 

Stored Procedures. Stored procedures are database routines 
that are constructed from individual data manipulation state- 
ments. They reside in the data server as database executable 
routines. Similar to function calls in programming languages, 
stored procedures can accept input arguments and return an 
exit status. However, unlike traditional functions, stored pro- 
cedures can return a variable-length stream of data, which 
is organized into rows and columns. Columns typically rep- 
resent data fields, which are placeholders for information in 
a database table. Rows usually represent the data records 
that are stored in a database table. Results are typically
returned in ASCII tab-delimited format for presentation or
postprocessing. 

A stored procedure may contain one or more batches of SQL
(Structured Query Language) statements. The SQL statements
make use of input arguments (if any), perform some
data manipulation operation, and return a stream of formatted
data (if any) to the client application that made the
call. In addition, stored procedures can make use of control
flow constructs such as if...else, while, and so on to perform
complex data manipulation tasks. Stored procedures can
also call other stored procedures, performing a series of data
manipulation operations with a single function call. In DMS
stored procedures are used extensively. For example, the
entire DMS process shown in Fig. 2 is supported as a series
of stored procedures.

Fig. 8. Software components of the client/server architecture
used in DMS.

DMS clients are able to call stored procedures and receive 
data returned from them via a bidirectional communications 
protocol. In DMS, this protocol is fully supported by the
database manufacturer in the form of a series of libraries
that link the user interface forms code with the database
manipulation language (e.g., SQL).

Triggers. Operations performed by stored procedures that
modify data may cause the execution of a server trigger. A
trigger is a special type of stored procedure installed on the 
server to ensure the relational integrity of the data in the 
DMS database. Any operation on a table or column that modi- 
fies data will cause a trigger to be executed. Triggers are used 
in DMS to ensure that all operations that modify data are 
carried out consistently throughout the database. For exam- 
ple, a trigger will prevent a user from being deleted from the 
DMS database if that user is referenced in an Open defect. 

Rules. Rules are another database object used in the DMS 
process to enforce integrity constraints that go beyond 
those implied by a particular data type. Rules are applied to 
individual columns of tables to ensure that the values ap- 
plied by insert or update operations conform to a particular 
set or range of possible values. For example, the table column
that denotes the state of a defect is a character data
type of up to two characters. The rule applied to this column 
ensures that a prospective insert or update will not succeed 
unless the new value corresponds to one of the seven states 
shown in Fig. 2. The following example illustrates this rule: 

create rule status_rule
as @status in ("n", "nr", "u", "o", "pr", "r", "v")

The states "n" and "nr" refer to the Unscreened and Rejected
Unscreened states, respectively. The remaining states represent
(in order) Unreceived, Open, Unscreened Resolve, Unverified
Resolve, and Verified Resolve.
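
The rule above is enforced by the data server itself. As a rough illustration only, the same membership test could be mirrored on the client side in C++ so that a bad value is caught before the insert or update is attempted. The function name is_valid_status is hypothetical and is not part of DMS.

```cpp
#include <string>

// Hypothetical client-side mirror of status_rule: accepts only the
// seven state codes listed in the rule.
bool is_valid_status(const std::string& status) {
    static const char* const allowed[] = {"n", "nr", "u", "o", "pr", "r", "v"};
    for (const char* code : allowed) {
        if (status == code) return true;
    }
    return false;
}
```

A check of this kind would only complement the server rule, since the server remains the single place where the constraint is guaranteed to hold.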

Tables. The DMS database is constructed of a number of 
tables. The tables serve to gather data items into logically 
related groups. Separate tables exist to contain information
relating to the submit and resolve portions of defects. Other 
tables exist to maintain information about fixed modules,
attached files, and auxiliary information. There is another 
group of tables that maintain information about projects, 
users, and DMS server sites. These tables are related to one 
another via server keys which serve to map the relationships 
between tables in the database and help ensure relational 
integrity (see Fig. 9). When a stored procedure that returns 
rows from more than one table is executed, keys are used to 
make sure that the information is joined properly and that
the data returned does not contain any unwanted rows. 

User Interface Client. The user interface client is constructed 
from a commercially available 4GL forms language and a 
custom C language run-time executive. The forms language 
development environment allows the rapid construction and 
evaluation of groups of atomic user interface objects like 
input fields, pull-down menus, and form decoration. Stored
procedure calls are triggered by these atomic objects, causing
the server to generate rows of result data. The forms
language is designed to manage returned data rows efficiently
with a minimal amount of client coding. The C language
run-time executive provides the interface client with
access to HP-UX commands and custom C functions. Mail
generated by DMS originates from the client interface
through the mailx(1) command. Files that are attached to defects
are loaded into the server via a call to a C function by
the interface client.

The user interface client can be compiled to execute on a
number of hardware platforms. In addition to HP 9000
Series workstations, users routinely execute the tool from
networked Macintosh computers running Mac-X and PC-compatible
computers running X server software or VT100
terminal emulation. The latter is particularly convenient for
users who wish to run the tool from home via modem. Fig. 10
shows the DMS interface as seen from a VT100 terminal emulator.
The example in Fig. 10 shows the lookup choices given
to the user for the "how found" field of the submit function.
The main difference between this and the X11 presentation
is that the user must navigate the forms via control keys
instead of a mouse.

Utility Clients. Other types of clients, typically report clients,
can be created using C language libraries purchased from the
database manufacturer. Programs generated using these 
libraries differ from other types of host language interfaces 



that rely on embedded SQL. The libraries do not require a 
host language precompiler to process the source code into 
some intermediate form. These libraries have been used to 
generate custom reporting tools which can be executed from 
any suitably configured workstation as illustrated below:

$ subnum -file parms -project SUNFLOWER -sort submitter |
extract | csv_report > report_file

The client subnum generates the key values for every defect 
belonging to project SUNFLOWER which meets the criteria
contained in file parms ordered by the name of the submitter.
These key values are piped to the client extract which pulls 
information about each defect from the database. This infor- 
mation is then piped to a text processing script and deposited 
in a file. Using this scheme, data can be extracted from the
DMS database and readily formatted for spreadsheet appli- 
cations as well as plain ASCII text reports. Additionally, there 
are a number of third-party software vendors that provide 
products that are designed to interact with DMS provided 
that the PC has network access to the data server. 

Execution Environments. DMS data servers are typically set up
on dedicated hosts. The server at our division is currently
installed on an HP 9000 Model 710 computer configured
to connect with up to 200 simultaneous clients (see
Fig. 11). Client code may reside on any suitably configured
workstation. For convenience and maintainability we have
placed the user interface client code on a single HP 9000
Model 380 server which distributes the code to other client
hosts via NFS.

Reporting and Searching. Reports can be generated from DMS
via the Report menu item from the top-level form shown in
Fig. 3. A variety of canned reports are available, as well as an ad hoc
query mechanism. Fig. 12 shows how a user might generate
a report of all open defects for a particular engineer. When
initiated, the interface prompts the user for the name of a
responsible engineer from a list of responsible engineers
who have open defects for a given project. Users can readily
generate reports of open defects ordered by submit number, 
submit date, severity, and others. Fig. 13 shows a summary 
list of open defects ordered by submit number. These reports 
can be printed or saved to a file in a number of formats. 

The ad hoc query mechanism allows customized summary 
reports to be generated based on any combination of selec- 
tion criteria. This utility can be used to produce, for example, 
a list of resolved defects for a given project with a particular 
submit version, resolution code, and fix time greater than 
eight hours. These customized reports are generated by the
Search Editor selection shown in Fig. 12. 
Fig. 12. A screen for generating a list of defects belonging to a
particular engineer.

Fig. 11. The hardware components of the DMS client/server
architecture.

We use PC-based and HP-UX-based spreadsheet packages to
produce custom graphic reports for project management.
These reports are generated and distributed on a weekly
basis. The reports can also be generated on a demand basis
by the individual project teams. Figs. 14 to 18 show samples
of some of these reports.

Current Status

DMS is currently in its fourth release after two years of development.
The latest release, version 2.0, contains all of the
original target functionality and a significant number of user-requested
enhancements. This latest release contains features
that allow online project configuration, user configuration,
and defect modification. These features, along with other
new features, reduced the amount of system administration
time to expected levels. Enhancements in version 2.0 include
changes to make similar operations exhibit more consistent
behavior throughout the tool and changes to the defect management
process to satisfy the demonstrated needs of project
teams. Finally, the enhancements provided in version 2.0
of DMS that have evolved over several releases reflect the
maturity of the product and the relative stability of the
feature set.

Currently DMS is in use at six HP divisions on four sites.
More than seven hundred users in R&D, QA, and technical
support have logged over 7500 defects against 25 major
projects in 18 months of use.

Fig. 14. This report shows how the ratio of defect find rate to defect
fix rate can be used to track ongoing project progress.

Lessons Learned 

As anticipated in early design sessions, choosing an interface 
toolset with incomplete graphical user interface capabilities 
proved to be a significant hurdle for many users. In its current
form, DMS employs a character-based windowing
scheme that runs in both X11 and ASCII keyboard environments.
While this interface style maximizes connectivity,
allowing virtually anyone with LAN access to use DMS, it
has proven to be a tough sell to R&D users who expect tools
to exhibit an OSF/Motif look and feel. As a result of the decision
to favor connectivity over X11 and OSF/Motif support,
more non-X11 users have access to DMS at the expense of
X11 users who are inconvenienced by a more primitive
interface.

While an evolutionary delivery can be used to deliver just-in-time
functionality to users, one should not underestimate the
importance of user involvement in making design decisions
and prioritizing implementation tasks. The users group and
steering committee proved to be successful tools in guiding
the evolution of DMS. The users group is an open forum that
allows communication between users and developers. The
steering committee consists of a group of expert users who
ensure that the tool evolves in a consistent direction.

Fig. 15. This report tracks open defect counts against estimates of
projected open defects necessary to meet a scheduled completion
deadline.

Conclusion 

DMS has achieved its objective as an "industrial-strength"
defect management solution. Since its introduction, it has 
been used to manage defect information for many flagship 
products over many divisions. It has proven itself as a 
24-hour-a-day workhorse, serving as many as 40 to 50 simul- 
taneous users during normal business hours. In fact, the 
real-time reliance on DMS has necessitated scheduled main- 
tenance during late night and weekend hours. DMS enabled 
us to extend its contribution into the R&D community by 
providing the services of a self-contained software process
tool with minimum administration. As a result, DMS has 
increased the scale at which we can provide defect tracking 
services without incurring significant personnel increases. 




Fig. 16. This report shows over- 
Fig. 17. This report gives a measure of overall software quality
(bugweight) over time.

Another major success for DMS is the degree to which re- 
lated projects can now share defect information. Based on 
the immediate acceptance and use of the defect-sharing capabilities
of DMS, users now actively treat defect information
as sharable, and are conscientiously communicating
with other projects via this mechanism. For multisite operation,
DMS successfully demonstrated the ability to cross-submit
and cross-track over multiple sites transparently. By
making use of passive server-to-server communication
mechanisms, DMS servers at different sites can easily be
configured to communicate defect information.

An area where DMS has greatly assisted R&D management is 
in metrics analysis. Using SQL, raw defect data is much more 
readily analyzed and converted into a digestible form. The
rate at which more extensive ad hoc analyses can be delivered
has greatly increased. In addition, protecting process
and data integrity has enhanced the accuracy and reliability
of all queries, ad hoc or standard.

Timeliness has been another major success of DMS. As can 
be seen in Fig. 19, DMS has succeeded in shortening much
of the defect life cycle time from an average of days to hours. 
Fig. 18. This report provides a cumulative segmented view of
unresolved defects by severity.

Fig. 19. The difference between the defect life cycles before and
after DMS. This shows the elapsed time between defect find date
and engineer assign date.

In addition to the direct benefits DMS has provided, we have 
observed some interesting cultural shifts in the R&D and test 
communities in the printer divisions at our site. Project team 
members have come to place great reliance in the ability to 
get instantaneous defect information. The rigorous process 
imposed by DMS ensures that all defects contain the minimally
required set of information and that all of the information
has been validated against centrally maintained lookup
tables. Users have come to appreciate the ease with which 
defect information can be located and manipulated 24 hours 
a day, seven days a week. 

DMS has also empowered users to manage defects within a 
proven process model. Given that the number of projects 
and the defects they generate will continue to increase, it is
clear that the number of individuals needed to move a defect
through the process needs to be minimized. DMS has
been successful in that it has encapsulated a defect manage- 
ment process known to work for the laser printer firmware 
development process and has decreased the number of indi- 
viduals needed to manage defect information. DMS allows 
project teams to manage all aspects of the defect process 
without the need for the intervention of defect tracking 
experts or other outside agencies. 

In an unanticipated use, DMS has allowed us to share defects
with third-party software vendors and still maintain security 
of internal defect information through the passive server data 
exchange mechanism. We have successfully used this mecha- 
nism to import and track defects that originated from a pub- 
lic DMS database server that was accessible to third-party 
engineers. This capability has generated much interest from 
project teams that use DMS and have extensive interaction 
with third-party software vendors. 
Realizing Productivity Gains with C++



Although C++ contains many features for supporting highly productive 
software development, some characteristics of this object-oriented 
programming language tend to slow the realization of these 
productivity gains. 

by Timothy C. O'Konski 



In many cases there is a long delay between starting to work 
in the C++ language and realizing the potential productivity 
gains of the object-oriented paradigm, including code reuse.
Before this delay can be shortened or eliminated entirely, 
practical issues relating to the multiparadigm nature of C++ 
and its C ancestry must be understood. 

The object-oriented benefits of data abstraction and inheritance
coupled with type checking give C++ a natural advantage
when attempting to build both system and application
software. Additional productivity gains can be obtained by 
reusing a class library if the following considerations can 
be met: 

• Programming mechanisms contained within the class library 
must be understood by the programmer before they can be 
expanded and reused correctly. 

• When selecting a C++ library class or class template, the 
size, performance, and quality characteristics of each class 
or class template component must be apparent to the 
programmer. 

• Appropriate class or class template definitions must first 
be properly located by the programmer so they can be in- 
corporated into the program currently under development. 

• The time taken by the programmer to learn how to use the
library correctly must be much less than the time necessary
for the programmer to create new code. Otherwise, the programmer
might attempt to rewrite the C++ library, inhibiting
the productivity gain.

This paper describes our experiences with developing new 
C++ software and modifying existing C++ libraries. It also 
looks at possible uses of templates and exception handling 
defined in the emerging ANSI C++ standard X3J16.

Mixing Programming Paradigms 

The following discussion describes some standard C++ 
programming paradigms and their associated problems.

Concrete Data Types. Concrete data types are the representation
of new user-defined data types. These user-defined data
types supplement the C++ built-in data types such as integers
and characters to provide new atomic building blocks
for a C++ program. All the operations (i.e., member functions)
essential for the support of a user-defined data type
are provided in the concrete class definition. For example,
types such as complex, date, and character strings could all
be concrete data types which (by definition) could be used
as building blocks to create objects in the user's application.



The following code shows portions of a concrete class 
called date, which is responsible for constructing the basic 
data structure for the object date.

typedef int boolean;
#define TRUE 1
#define FALSE 0

class date {
public:
    date(int month, int day, int year);   // Constructor
    ~date();                              // Destructor

    boolean set_date(int month, int day, int year);

    // Additional member functions could go here...

private:
    int year;
    int numerical_date;

    // Additional data members could go here...
};

Designers of concrete data types must ensure that users of
this class will not want to add functionality to the class
through derivation. Otherwise, the class must be designed to
handle incremental additions in advance. Failing to do so
could cause an ill-defined set of functions (for example, a
missing assignment or copy constructor¹) which in turn
would cause a defect to be uncovered by unsuspecting users
of the concrete data type.
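
To make the point about a missing assignment operator or copy constructor concrete, the following is a minimal sketch of a concrete type that owns free-store memory and therefore defines all three of copy constructor, assignment, and destructor. The class name text and its members are hypothetical and are not taken from any library discussed here.

```cpp
#include <cstring>

// Hypothetical concrete type that owns a heap buffer.
class text {
public:
    explicit text(const char* s = "") : buf_(copy_of(s)) {}
    text(const text& other) : buf_(copy_of(other.buf_)) {}   // copy constructor
    text& operator=(const text& other) {                      // assignment
        if (this != &other) {
            char* fresh = copy_of(other.buf_);  // allocate before releasing
            delete[] buf_;
            buf_ = fresh;
        }
        return *this;
    }
    ~text() { delete[] buf_; }                                // destructor

    const char* c_str() const { return buf_; }
    // Overwrites the first character if the string is not empty.
    void set_first(char c) { if (buf_[0] != '\0') buf_[0] = c; }

private:
    static char* copy_of(const char* s) {
        char* p = new char[std::strlen(s) + 1];
        std::strcpy(p, s);
        return p;
    }
    char* buf_;
};
```

Without the two copy operations, the compiler-generated versions would copy only the pointer, so two objects would share (and later both delete) the same buffer.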

Abstract Data Types. Abstract data types represent the inter- 
face to more than one implementation of a common, usually 
complicated concept. Because an abstract data type is a base
class to more than one derived class, it must contain at least
one pure virtual function. Objects of this type can only be
created through derivation in which the pure virtual function
implementation is filled in by the derived classes.

The following is an example of an abstract base class: 

class polygon { 
public: 

// constructor, destructor and other member functions 
// could go here... 

virtual void rotate (int i) = 0; //a pure virtual function 
// other functions go here... 

}; 
Glossary

Although the terminology associated with object-oriented programming and C++
has become reasonably standardized, some object-oriented terms may be slightly
different depending on the implementation. Therefore, brief definitions of some of
the terminology used in this paper are given below. For more information on these
terms see the references in the accompanying article.

Base Class. To reuse the member functions and member data structures of an
existing class, C++ provides a technique called class derivation in which a new
class can derive the functions and data representation from an old class. The old
class is referred to as a base class since it is a foundation (or base) for other
classes, and the new class is called a derived class. Equivalent terminology refers
to the base class as the superclass and the derived class as the subclass.

Catch Block. One (or more) catch statements follow a try block and provide
exception-handling code to be executed when one (or more) exceptions are
thrown. Caught exceptions can be rethrown via another throw statement within
the catch block.

Class. A class is a user-defined type that specifies the type and structure of the
information needed to create an object (or instance) of the class.

Constructors. A constructor creates an object, performing initialization on both
stack-based and free-storage allocated objects. Constructors can be overloaded,
but they cannot be virtual or static. C++ constructors cannot specify a return type,
not even void.

Derived Class. A class that is derived from one (or more) base classes.

Destructors. A destructor effectively turns an object back into raw memory. A
destructor takes no arguments, and no return type can be specified (not even
void). However, destructors can be virtual.

Exception Handling. Exception handling, which is a feature defined in the ANSI
X3J16 Draft and implemented in HP's 3.0 C++ compiler, provides language support
for synchronous event handling. This feature is not the same as existing asynchronous
mechanisms such as signals, which are supported by the underlying environment.
The C++ exception handling mechanism is supported by the throw statement,
try blocks, and catch blocks.



Other classes, such as square, triangle, and trapezoid, can be
derived from polygon, and the rotate function can be filled
in and defined in any of these derived classes. Note that
polygon objects cannot be constructed. The C++ compiler
will prevent this from happening because there is at least
one pure virtual member function not yet defined.
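
The derivation described above can be sketched as follows. The square class, the orientation accessor, and the helper rotate_via_base are assumptions added for illustration; only polygon and rotate come from the example above.

```cpp
class polygon {
public:
    virtual ~polygon() {}
    virtual void rotate(int degrees) = 0;   // a pure virtual function
    virtual int orientation() const = 0;    // hypothetical accessor for illustration
};

class square : public polygon {
public:
    square() : angle_(0) {}
    // A square looks the same every 90 degrees, so keep the angle in [0, 90).
    void rotate(int degrees) override {
        angle_ = (angle_ + degrees) % 90;
        if (angle_ < 0) angle_ += 90;
    }
    int orientation() const override { return angle_; }
private:
    int angle_;
};

// polygon p;   // would not compile: polygon still has pure virtual functions

// Works with any class derived from polygon through the abstract interface.
int rotate_via_base(polygon& p, int degrees) {
    p.rotate(degrees);
    return p.orientation();
}
```

Code written against polygon, such as rotate_via_base, needs no change when triangle or trapezoid is added later.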

Abstract data types sometimes suffer from too many functions
being declared virtual. This adds both size and some
slight overhead to the program's speed of execution. Inlining 
will usually compensate for the speed overhead incurred by 
a virtual function, but will add even more to the size of the 
program or library object file. 

Node Classes. Node classes are viewed as a foundation
class component upon which derived classes can be built.
Designed to be part of a hierarchy, a node class relies on
services from other base classes and provides some unique
services itself. A node class defines any virtual functions
necessary to change the object's behavior or fill in any pure
virtual function definitions still left undefined in the derived
class. Additional functions are also added by a node class to
widen the behavior of an object. Node classes, by their very
nature, will not suffer the fate of misconstrued concrete
data types described above, but may suffer from some
programming errors.



Member Functions. Member functions are associated with a specific object of a
class. That is, they operate on the data members of an object. Member functions
are always declared within a class declaration. Member functions are sometimes
referred to as methods.

Multiple Inheritance. A derived class can be derived directly from one or more
base classes. Any member function ambiguities are resolved at compile time.

Object. Objects are created from a particular class definition and many objects
can be associated with a particular class. The objects associated with a class are
sometimes called instances of the class. Each object is an independent object
with its own data and state. However, an object has the same data structure (but
each object has its own copy of the data) and shares the same member functions
as all other objects of the same class and exhibits similar behavior. For example,
all the objects of a class that draws circles will draw circles when requested to do
so, but because of differences in the data in each object's data structures, the
circles may be drawn in different sizes, colors, and locations depending on the
state of the data members for that particular object.

Template. A class template provides a mechanism for indicating those types that
need to change with each class instance. The generic algorithm associated with
the class remains invariant. Later in the class instantiation, these types are bound
to built-in or user-defined types.

Throw Statement. A throw statement is part of the C++ exception handling mechanism.
A throw statement transfers control from the point of the program anomaly
to an exception handler. The exception handler catches the exception. A throw
statement takes place from within a try block, or from a function in the try block.

Try Block. A try block defines a section of code in which an exception may be
thrown. A try block is always followed by one or more catch statements. Exceptions
may also be thrown by functions called within the try block.
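
As a small sketch of how the throw statement, try block, and catch block entries fit together, the following shows an exception thrown from a called function, caught and rethrown by one handler, and finally handled by an outer one. All function names are hypothetical; the exception classes are those of the standard library.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: throws when asked to divide by zero.
int checked_divide(int a, int b) {
    if (b == 0) throw std::invalid_argument("divide by zero");
    return a / b;
}

// Catches the exception, records its message, and rethrows it.
int divide_and_log(int a, int b, std::string& log) {
    try {
        return checked_divide(a, b);   // thrown by a function called in the try block
    } catch (const std::invalid_argument& e) {
        log = e.what();
        throw;                         // rethrow the same exception
    }
}

// Outermost handler: converts the failure into a fallback value.
int divide_or(int a, int b, int fallback, std::string& log) {
    try {
        return divide_and_log(a, b, log);
    } catch (const std::exception&) {
        return fallback;
    }
}
```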

Virtual Functions. A virtual function enables the programmer to declare member
functions in a base class that can be redefined by each derived class. Virtual
functions provide dynamic (i.e., run-time) binding depending on the type of object.



Common problems in declaring node classes stem from the 
fact that they are designed to be sources of object deriva- 
tion. Because of this, the presence of any virtual functions 
(in either the base or any derived classes of the node class) 
will require the presence of a virtual destructor to ensure
proper class cleanup. Because one cannot determine if and
when a virtual function might be added by a class derivation,
it is better to be safe and declare the destructor virtual
in the base class. This is because the "virtualness" of the
destructor cannot be added in any derived class. It must be
part of the base class destructor declaration.
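
The effect of declaring the base class destructor virtual can be observed with a sketch like the one below, in which a derived object is deleted through a base class pointer. The class names and the counter used to observe cleanup are assumptions made for illustration.

```cpp
// Hypothetical node class: the destructor is virtual in the base,
// so cleanup in derived classes runs even when deleting via a base pointer.
struct node_base {
    virtual ~node_base() {}
    virtual const char* name() const { return "base"; }
};

struct derived_node : node_base {
    explicit derived_node(int* counter) : counter_(counter) {}
    ~derived_node() override { ++*counter_; }   // records that cleanup happened
    const char* name() const override { return "derived"; }
private:
    int* counter_;
};

// Returns how many derived destructors ran when deleting through the base.
int destroy_through_base() {
    int cleaned = 0;
    node_base* p = new derived_node(&cleaned);
    delete p;          // calls ~derived_node(), then ~node_base()
    return cleaned;
}
```

If the base destructor were not virtual, deleting through the base pointer would have undefined behavior and the derived cleanup could be skipped.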

An additional problem common to a node class is improper 
verification of protected data members. Because a derived 
class can modify or change protected data members, they 
could be invalidated by any derived class. Adding assert 
statements to a special "debug" version of the node class 
that validates the protected data can detect this type of 
programming error. 
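
A minimal sketch of this idea, assuming a hypothetical node class whose protected count must stay within a limit: the check is an assert, so it is active in a debug build and compiled out when NDEBUG is defined.

```cpp
#include <cassert>

// Hypothetical node class with protected data that derived classes may change.
class counter_node {
public:
    counter_node() : count_(0), limit_(10) {}
    virtual ~counter_node() {}
    int count() const { check(); return count_; }
protected:
    int count_;
    int limit_;
    // Validates the protected data; a no-op when NDEBUG is defined.
    void check() const { assert(count_ >= 0 && count_ <= limit_); }
};

// A derived class that modifies the protected data and revalidates it.
class stepping_node : public counter_node {
public:
    void step() {
        if (count_ < limit_) ++count_;
        check();
    }
};
```

A derived class that pushed count_ outside the allowed range would trip the assertion the next time any checked member function ran.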

Interface Classes. Interface classes are the most humble but
important and overlooked of all classes. The purest form of
an interface class doesn't cause any code to be generated.
Consider the following unsafe class called List, which is
wrapped by the class template SafeList:
template<class T>
class SafeList : private List<void*> {
public:
    void insert(T* p) { List<void*>::insert(p); }
    void append(T* p) { List<void*>::append(p); }
    T* get() { return (T*) List<void*>::get(); }
};

Here, a class template called SafeList is used to convert an
unsafe generic list of void* pointers into a more useful family
of type-safe list classes.² Type-safe means that the compiler
checks for correct pointer types instead of allowing any
pointer (e.g., void*) to be used within a list template. The
very nature of a void* pointer is that it may contain a pointer
to any object. By adding the SafeList template, we are ensuring
that a List template can only contain pointers to classes
that we have defined for use with a List template.

Interface classes are used to adjust an interface, provide 
robustness with a greater degree of type safety, or prevent 
member function names from two different class hierarchies 
from clashing. 
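
Because the List class itself is not shown, the sketch below supplies a hypothetical stand-in built on std::vector so that the SafeList wrapper can be compiled and exercised. The semantics chosen for insert, append, and get (add at front, add at back, remove from front) are assumptions, as is the size member.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the unsafe generic list.
template <class T>
class List {
public:
    void insert(T item) { items_.insert(items_.begin(), item); }  // add at front
    void append(T item) { items_.push_back(item); }               // add at back
    T get() {                   // removes and returns the first element;
        T front = items_.front();   // the list must not be empty
        items_.erase(items_.begin());
        return front;
    }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<T> items_;
};

// Type-safe interface class: only T* can go in, and only T* comes out.
template <class T>
class SafeList : private List<void*> {
public:
    void insert(T* p) { List<void*>::insert(p); }
    void append(T* p) { List<void*>::append(p); }
    T* get() { return static_cast<T*>(List<void*>::get()); }
    using List<void*>::size;
};
```

With this wrapper, an attempt to append a double* to a SafeList<int> is rejected by the compiler, whereas List<void*> would accept it silently.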

Handle Classes. Handle classes provide an effective separation
between an object interface and its implementation. Handle
classes provide a level of indirection via pointers. Additional
benefits include an interface to memory management and
the ability to rebind iterators for a class representation. An
iterator is a function that returns the next element in a list,
array, string, or any collection of items each time it is called.

A handle class is a good candidate for a class template: 

template <class T> class handle_class {
    T* representation;
public:
    T* operator->() { return representation; }
    //...
};

This code fragment shows how a handle class is used to 
manipulate pointers to objects of type T, instead of actual 
user-defined class representations of objects of type T. A 
problem with handle classes is that they require cooperation
between the class being handled and the handle class itself. 
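
A slightly fuller sketch of a handle, with a constructor and a rebind operation added, might look like this. The names handle, rebind, point, and sum_via_handle are hypothetical, and the handle in this sketch does not own the object it points to.

```cpp
// Hypothetical handle: forwards member access to a representation object.
template <class T>
class handle {
public:
    explicit handle(T* rep) : rep_(rep) {}
    T* operator->() { return rep_; }
    void rebind(T* rep) { rep_ = rep; }   // point the handle at another representation
private:
    T* rep_;   // not owned in this sketch
};

struct point { int x; int y; };

// Client code sees only the handle, never the representation directly.
int sum_via_handle(handle<point> h) { return h->x + h->y; }
```

Because clients hold only the handle, the representation behind it can be replaced with rebind without changing client code.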

C Roots 

The fact that C++ is based on the C programming language 
is evident throughout the language. C is retained as almost a
subset of C++ and so is the emphasis on C facilities that are
low-level enough to deal with the most demanding system 
programming tasks. 

Once a class definition has been agreed upon by a programming
team, the programmers have the ability to proceed with
implementation by using member function code stubs whenever
necessary. This practice of filling in stubs with real function
code when necessary in conjunction with C++'s static
type checking enables a form of rapid prototyping via incremental
development.³ C++ allows iterative design and implementation
to proceed in parallel, facilitating a more rigorous
design methodology than conventional C programming.¹

Because of the C roots of C++, most or all of the low-level
programming tasks that are within the range of C are still
within the scope of C++. However, some of the problems that
have plagued C programmers also affect C++ programmers.



The problems encountered in this regard include: uninitialized
pointers, data narrowing, memory leaks, and conflicting
#defines, typedefs, and enums.

Uninitialized Pointers. An uninitialized pointer might contain 
a garbage address, and if used in its uninitialized state may 
cause the program to abort. 

int *pint;

// other code (*pint is not initialized to any address)
*pint = 9; // may result in a "Bus error (core dumped)"
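
One common defence, sketched here with a hypothetical helper, is to initialize every pointer to the null pointer at its declaration so that the "not yet set" state can be tested before the pointer is used.

```cpp
// Hypothetical helper: stores value through p only when p points somewhere,
// and reports whether the store took place.
bool store_if_set(int* p, int value) {
    if (p == nullptr) return false;
    *p = value;
    return true;
}
```

A pointer declared as int *pint = nullptr; then fails the test cleanly instead of writing through a garbage address.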

Data Narrowing. On a system where sizeof(short) is two and
int is no wider than short, the code:

unsigned short s = 65535;
signed int i = s;

will silently produce the value -1 when i is printed, although s
itself is unchanged. This is because in the unsigned version the
high-order bit contributes to the magnitude of the value, while in
the signed (i.e., int i) version the same bit is the sign bit and
signifies a negative number. When the unsigned value s is assigned
to the signed int value i, the number changes from a large unsigned
value to a small negative value because its high-order bit is
interpreted in a different manner. Where int is wider than short,
the assignment preserves 65535, and the same effect appears only
when the destination is a signed short.
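The effect can be reproduced independently of the platform's int size by using fixed-width types. The following is a minimal sketch, not code from the article: the helper names to_signed16 and to_signed32 are hypothetical, and the result of the 16-bit conversion is implementation-defined before C++20 (two's complement hardware yields -1).

```cpp
#include <cstdint>

// Hypothetical helper: reinterpret a 16-bit unsigned value as 16-bit signed.
// The high-order bit of the source becomes the sign bit of the result.
std::int16_t to_signed16(std::uint16_t u) {
    return static_cast<std::int16_t>(u);
}

// Hypothetical helper: converting to a wider signed type keeps the value,
// because every 16-bit unsigned value fits in 32 signed bits.
std::int32_t to_signed32(std::uint16_t u) {
    return static_cast<std::int32_t>(u);
}
```

Calling to_signed16(65535) shows the narrowing described above, while to_signed32(65535) shows that a wider destination avoids it.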

Memory Leaks. When the only location that contains a pointer to
allocated memory is deallocated, a "memory leak" occurs (just as
in C). This means that the location that contained the pointer
to memory allocated out of free storage is no longer valid.
Thus, the allocated memory cannot be accessed or released for the
duration of the program. For example, in the sequence:

{
    char *s = new char[10];
    // some code here...
}

// the variable s is no longer accessible

the pointer s is out of scope and a memory leak will occur
immediately after this code segment if a delete [] s operation is
not performed before the end of this code fragment.
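One way to make the release automatic is to tie the allocation to an object whose destructor performs the delete. The sketch below is illustrative only: CharBuffer, its live counter, and use_buffer are hypothetical names introduced here, and the counter exists solely so the effect can be observed.

```cpp
#include <cstddef>

// Hypothetical owner class: ties the lifetime of a heap array to a scope.
class CharBuffer {
public:
    explicit CharBuffer(std::size_t n) : data_(new char[n]), size_(n) { ++live_; }
    ~CharBuffer() { delete [] data_; --live_; }   // array form of delete
    char* get() { return data_; }
    std::size_t size() const { return size_; }
    static int live() { return live_; }           // buffers currently allocated
private:
    CharBuffer(const CharBuffer&);                // copying disabled:
    CharBuffer& operator=(const CharBuffer&);     // declared, never defined
    char* data_;
    std::size_t size_;
    static int live_;
};
int CharBuffer::live_ = 0;

// Returns the number of live buffers observed inside the scope.
int use_buffer() {
    CharBuffer s(10);
    s.get()[0] = 'x';
    return CharBuffer::live();
}   // s goes out of scope here; its destructor releases the array
```

After use_buffer() returns, CharBuffer::live() is back to its previous value, which is the property the bare pointer in the fragment above lacks.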

#defines, typedefs, and enums. Problems with these declarations
occur on a per-program basis when they are declared at file scope.
For example,

// header file #1:

typedef int boolean;

//...

<eof>

and then:

// header file #2:

typedef unsigned boolean;

//...

<eof>

will cause the compiler to issue an error and abort when both
header files are included in the same source file, because the
two typedef declarations conflict.

Typical Problems with Libraries 

The ANSI C++ Committee X3J16 and a parallel ISO (International
Standards Organization) committee are currently
standardizing the C++ language. Over the past six years the
C++ language has continued to evolve through five major
releases. This moving target has resulted in libraries and
programs that typically have upgrades that accommodate
the new language features without taking full advantage of
them.' 1 This means that the programmer must make deci- 
sions regarding which feature is the correct one to use with 
each new release of a class library. 

Requiring C++ library users to be conversant with both the
previous and current C++ versions is a hardship on the C++
programmer. As a result some programmers have completely
avoided new versions of C++ and stayed with the C++ release
upon which their product is based. This problem will subside
significantly when the X3J16 committee work becomes
solidified into a draft standard.

The traditional object-oriented approach of using class derivation
(i.e., inheritance) to reuse existing functionality is not
necessarily the best way to make use of C++ classes when
what is needed is a has-a relationship, as opposed to the
is-a relationship that inheritance traditionally provides. Is-a
relationships are provided for by C++ via inheritance, which is
commonly known as class derivation. For example, if class B is
derived from class A, B has all the characteristics of A plus
some of its own, and we can safely say that B is a kind of A.
Has-a relationships are supported by composition rather
than by inheritance. Composition is implemented by making
one class a member of another class." For example, we have
a has-a relationship if B is contained in A.
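The two relationships can be placed side by side in a few lines. This is a sketch under assumed names: Vehicle, Car, Engine, and describe_any are hypothetical and do not come from any library discussed in the article.

```cpp
#include <string>

// Hypothetical classes for illustration only.
class Engine {
public:
    std::string start() const { return "engine started"; }
};

class Vehicle {
public:
    virtual ~Vehicle() {}
    virtual std::string describe() const { return "vehicle"; }
};

// is-a:  Car derives from Vehicle and can stand in wherever a Vehicle is expected.
// has-a: Car contains an Engine as a member and forwards requests to it.
class Car : public Vehicle {
public:
    virtual std::string describe() const { return "car"; }
    std::string start() const { return engine_.start(); }
private:
    Engine engine_;
};

// Accepts any kind of Vehicle; the virtual call selects the derived behavior.
std::string describe_any(const Vehicle& v) { return v.describe(); }
```

Here Car reuses Engine without exposing Engine's interface to its own clients, whereas deriving Car from Engine would make every Engine operation part of Car's public interface.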

It is still not clear whether to use multiple inheritance to
combine the features of two different class libraries (i.e.,
both via is-a relationships) into a new class. One school of
thought argues that multiple inheritance gives the class designer
much more flexibility than single-inheritance class
relationships.6,7,8

Classes that incorporate the new exception handling mechanism
(described below) and also reside in multiple libraries
do not yet exist on the marketplace. Therefore, conclusive
evidence regarding the utility of multiple inheritance as a
language feature to be used when combining classes from
multiple libraries cannot be constructed until such C++
libraries exist and are successfully reused.

Templates 

Templates provide a type-safe way of creating what is essen- 
tially a macro-like textual substitution mechanism and manip- 
ulating different types in a generic fashion. Templates provide 
a way to define those types that need to change with each 
class instance. Templates are created by parameterizing 
types in a class definition. These parameters act as placeholders
until they are bound to actual types such as int,
double, short, and char. For example, in the following code 
fragment, which is a template for an array class, Alphanum is 
the parameterized type. 

//...

const int arraysize = 16;
template <class Alphanum>
class array {
public:
    array(int sz=arraysize)
        { size=sz; indx=new Alphanum [size]; }
    virtual ~array() { delete [] indx; }  // array form of delete

//...

private:
    int size;
    Alphanum *indx;

//...

};

When this template is used, objects of the array class might 
look like: 

int main ()
{
    array <int> intx (2);       // integer objects...
    array <double> doublex (2); // double objects...
    array <char> charx (2);     // character objects...
}

This shows that the actual type is substituted for the generic 
Alphanum defined in the template. 

Using template classes creates a need for specific configura- 
tion management and tool support. 1 ' Additionally, template 
syntax is complicated and makes the code more difficult for 
others to understand. Tool support is needed to help cover 
the template syntax issues and for manipulating the interac- 
tions between templates, classes, and exceptions. 

Exceptions 

An exception is an event that occurs during program execu- 
tion that the program is typically not prepared to handle. This 
event usually results in the program transferring control to 
another part of the program (an exception handler) that can
handle the event. Exception handling is necessary for robust, 
reusable libraries. Since exceptions may cause resources to 
be released in an unexpected manner, acquisition of re- 
sources and appropriate cleanup is a new requirement on 
class libraries. The typical mechanism of acquisition and re- 
lease of files can easily be handled by using object construc- 
tors and destructors as shown in the following example. 

#include <stdio.h>

class FilePtr {
    FILE* p;    // declare pointer to a file...
public:
    FilePtr(const char* n, const char* a)   // class constructor
        { p = fopen(n, a); }
    FilePtr(FILE* pp) { p = pp; }
    ~FilePtr() { fclose(p); }   // class destructor closes file...
    operator FILE*() { return p; }
};

With this object class, a file pointer p can be constructed
with either a FILE* or the arguments required for an fopen(). In
either case FilePtr will be destroyed at the end of its scope
and the destructor will close the file. A simple example of
this programming technique would look like:

void use_file(const char* name)
{
    FilePtr f(name, "r");
    // use f ...
}

The destructor will be called regardless of whether the
function is exited normally or an exception occurs.
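That guarantee can be checked with a stand-in resource that counts releases instead of closing a file. The sketch below is an illustration under assumed names (Resource, g_released, use_resource, and run are all hypothetical); it is not the FilePtr class itself.

```cpp
#include <stdexcept>

// Hypothetical stand-in for FilePtr: counts releases instead of closing a file.
static int g_released = 0;

class Resource {
public:
    Resource() {}
    ~Resource() { ++g_released; }   // the "cleanup" step
};

// Acquires a Resource, then either returns normally or throws.
void use_resource(bool fail) {
    Resource r;
    if (fail) throw std::runtime_error("simulated failure");
}

// Returns true if an exception escaped use_resource.
bool run(bool fail) {
    try { use_resource(fail); }
    catch (const std::runtime_error&) { return true; }
    return false;
}
```

Each call to run() increments g_released exactly once, whether use_resource() returned or threw, because stack unwinding runs the destructor of r before the handler is entered.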

Other C++ language issues that need to be considered when
reusing C++ libraries that incorporate exceptions include:

• Converting existing libraries to handle exceptions properly

• Remapping unexpected() and terminate() functions

• Combining the exceptions of one library with those of
another library in a single program

• Handling asynchronous events (e.g., signals) and
synchronous C++ exceptions simultaneously.

The following example program shows how multiple threads
of control, which are represented by HP-UX* asynchronous
signals and synchronous C++ exceptions, do not work together
simultaneously. The second throw statement (rethrow)
in the myhandler portion of the code, which tries to
transfer control outside the exception handler, will not work
at this time. The compiler cannot detect this condition because
of the possibility of separate compilation of the
signal-handler code and the code that traps to the signal handler.

#include <unistd.h>
#include <stream.h>
#include <signal.h>

/*
 * types needed below (used to be in <signal.h>, but were removed in
 * HP-UX 8.0)
 * Note: This program (and the other code in this article) was compiled on an
 * HP 9000 Model 730 running HP-UX A.08.07 using HP C++ A.03.00
 */

typedef void SIG_FUNC_TYP(int); /* for UNIX* System V compatibility */
typedef SIG_FUNC_TYP *SIG_TYP;
#define SIG_PF SIG_TYP

int i = 0;

/* The function myhandler is called when the SIGINT is detected by the
 * program; after which a "sleep" and then a "throw" is performed (i.e., in
 * a synchronous manner). PLEASE NOTE: This signal handler could reside
 * in a separate compilation unit, making it impossible for a compiler to
 * check for this error condition.
 */

void myhandler()
{
    try {
        (void) signal (SIGINT, (SIG_PF) myhandler);
        cout << "in myhandler now...\n" << flush;
        sleep(1);
        throw i; // error: NO throws allowed in signal handlers if they
                 // are not caught in the signal handler
    }
    catch (int i) {
        cout << "catch inside myhandler now...\n" << flush;
        throw; // this is an error because rethrows (or throws)
               // are not allowed to propagate outside a signal handler
    }
}

/* This main program waits for a SIGINT (i.e., usually CTRL/C),
 * which causes a core dump because of the throw propagation restriction
 * mentioned above. This mixture of asynchronous signals and the
 * synchronous exception handling causes C++ to exhibit a run-time failure.
 * Therefore, this construct should be avoided.
 */

int main ()
{
    cout << "starting the program...\n" << flush;

    // Arm our signal handler...
    (void) signal (SIGINT, (SIG_PF) myhandler);

    // forEVER loop...
    for (;;)
    {
        try { // Now that we are in a try block, let's throw something...
            throw i;
        }
        catch (int i)
        { // Now we're in the catch block, so let us notify the user and
          // sleep for a moment...
            cout << "in main catch now...\n" << flush;
            sleep(1);
        }
    }

    // we'll never get here, but for completeness...
    return 0;
}
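A commonly used way to avoid the construct is to let the signal handler do nothing but record the event, and to raise the exception later from ordinary synchronous code. The following is a minimal sketch of that idea and not part of the original program: on_sigint, Interrupted, check_interrupt, and run_loop are hypothetical names, and the raise() call only simulates CTRL/C so the behavior can be exercised.

```cpp
#include <csignal>
#include <stdexcept>

// Flag written by the handler; sig_atomic_t is the type the C and C++
// standards allow a signal handler to modify safely.
static volatile std::sig_atomic_t g_interrupted = 0;

// The handler only records the event; it never throws.
extern "C" void on_sigint(int) { g_interrupted = 1; }

// Hypothetical exception type raised from ordinary (synchronous) code.
struct Interrupted : std::runtime_error {
    Interrupted() : std::runtime_error("SIGINT received") {}
};

// Called from the main loop: converts the recorded signal into an exception.
void check_interrupt() {
    if (g_interrupted) {
        g_interrupted = 0;
        throw Interrupted();
    }
}

// Runs up to max_iterations passes; if raise_at is a valid pass number,
// SIGINT is raised on that pass. Returns true if the loop stopped on SIGINT.
bool run_loop(int max_iterations, int raise_at) {
    std::signal(SIGINT, on_sigint);
    try {
        for (int n = 0; n < max_iterations; ++n) {
            if (n == raise_at) std::raise(SIGINT);  // simulate CTRL/C
            check_interrupt();
        }
    } catch (const Interrupted&) {
        return true;
    }
    return false;
}
```

Because the throw now happens inside the try block of the loop rather than inside the handler, no exception ever has to propagate out of a signal handler.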

Conclusion 

C++ is an effective language for promoting both incremental 
development and code reuse. The additional capabilities of 
templates and exceptions need to have more idioms formalized
for their proper use. Because of C++'s increasing complexity,
stronger environmental support is critical for the
continuation of the language's success.

Bridging the Gap between Structured
Analysis and Structured Design for
Real-Time Systems

A real-time software design technique has been applied to the design of 
the software architecture for ultrasound imaging products. 

by Joseph M. Luszcz and Daniel G. Maier 



Structured analysis (SA) and structured design (SD) are
two widely used methodologies for software development.
Structured analysis specifies the functionality to be implemented
by a software system, and structured design is used
for partitioning a single task into a set of functional modules.
See "Structured Analysis and Structured Design Refresher"
on page 92 for a brief review of the structured analysis and
structured design notation and terminology used in this
paper.

When designing and implementing a software system represented
by a structured analysis model, it is usually necessary
to partition the functionality among a number of concurrent
tasks to meet the timing constraints placed on the software
system. In addition, to achieve a design with the characteristics
of low coupling and high cohesion, it is desirable to
partition the functionality into objects or packages for data
hiding.

Although structured techniques provide designers with a 
methodology for partitioning a complex system into manageable
pieces for analysis and design, there are some problems
in making the transformation from SA to SD for real-time
systems design. For example, the transformation from SA to 
SD does not easily support concurrency. Processes need to 
be grouped into concurrent tasks before detailed design. 
Another example is related to object-oriented design. SA 
and SD do not strongly support producing well-encapsulated 
objects. 

Because of these problems a software developer, after
specifying a real-time system using SA techniques, is often
not sure how to proceed to a design and typically resorts to
ad hoc design techniques. A methodology is needed to help 
the designer bridge the gap between SA and SD. 

The ADARTS Solution

ADARTS (Ada-based Design Approach to Real-Time Sys- 
tems)* is a high-level design methodology that effectively 
fills the gap between SA and SD by providing a systematic 
means (called process steps) for partitioning an SA specifi- 
cation model into a set of tasks, packages (objects), and 
communication links, which can then be designed using SD. 

* Although ADARTS uses Ada constructs, its use is not limited to Ada. ADARTS was created by
a group of companies called the Software Productivity Consortium (SPC). Two versions of the
ADARTS specification have been produced by the consortium. This paper is based on the first
version.3



The deliverables from ADARTS include a set of high-level 
architectural diagrams and a set of specifications called 
component interface specifications for each task and pack- 
age. These deliverables are described later in this article. 
Fig. 1 shows a simple overview of how ADARTS fits into the
software development process with SA and SD. The notation
and graphic symbols for the ADARTS diagram shown in
Fig. 1 are described in more detail later in this paper.

We have been using ADARTS for embedded software development
at Imaging Systems Division (ISY) since early 1990.
ADARTS helped us deal with the complexity inherent in the
design of the ISY shared software architecture, and we found
it indispensable in turning SA models into realizable designs.

While the ADARTS technique supports the synchronizing 
constructs inherent to the Ada programming language, it is 
not necessary to program in Ada to derive the majority of 
the benefits from ADARTS. We easily adapted the methodol- 
ogy and its diagramming terminology to a more conventional 
operating environment consisting of high-level language 
programs running under the control of a real-time operating 
system. At ISY, we develop software in the C language run- 
ning under the pSOS-68K operating system. However, our 
approach to using ADARTS works with any language. 

The diagramming notation used in ADARTS is based on the
Buhr notation,1 which is used to represent the tasks, packages,
and communication paths resulting from the design
decisions made in ADARTS (see Fig. 2). This diagramming
notation is supported in commercially available CASE tools.

The rest of this paper gives a brief overview of the ADARTS 
methodology (defining the notation shown in Fig. 2 along 
the way) and then presents an example of our experience 
with using ADARTS for software architecture design. 

ADARTS Process Steps 

The ADARTS process involves following the steps listed
below to create the ADARTS deliverables mentioned above 
and then using the deliverables to design and implement the 
system. 

1. Develop a real-time structured analysis specification, and
level the data flow diagrams to create a single (flat) diagram
from the hierarchical set of data flow diagrams in the original
model.

(Structured Analysis) (ADARTS Architectural Design) (Structured Design)

Fig. 1. The role of ADARTS in the software development process.



2. Identify concurrent tasks by applying task structuring
criteria, and determine the kind of communication and
synchronization mechanisms for the data and events passed
between tasks. Task structuring involves combining those
processes that should be combined or keeping separate
those processes that must be separate.

3. Identify packages (objects) by applying the package
structuring criteria to produce well-encapsulated software 
objects. 

4. Add support tasks to provide required synchronization 
and message buffering services. An example of a support 
task is a task that synchronizes access to a data store that
is accessed by multiple tasks. 

5. Package the tasks (we skipped this Ada-specific 
requirement). 



6. Develop an NRL (Naval Research Laboratory method)
module hierarchy (we skipped this step). 

7. Define component interface specifications. 

8. Perform the detailed design of task and package internals 
using structured design or an alternative methodology. 

9. Implement (code) the tasks and packages. 

10. Store the completed components and design 
documentation in a (reuse) library.

Note that the ADARTS process steps are all-inclusive, covering
the development of the software system from conception
to delivery. Steps 2 through 7 are the contributions of 
ADARTS that are above and beyond SA and SD. 




Structured Analysis and Structured Design Refresher 



Structured analysis and structured design concepts have been in use for several
years now, and thus the concepts, terminology, and graphic symbols are fairly well
known. The following are some very brief definitions of the SA and SD graphic
symbols shown in Figs. 1 and 2 and used in the accompanying article. Fig. 3 shows
the process flow from customer requirements to code when SA and SD are used
in software development. References 1 and 2 in the accompanying article contain
much more information about SA and SD techniques.

Structured Analysis Notation 

Structured analysis notation and methodology provide 

• A graphic and concise representation of software functionality 

• A technique for partitioning a problem into manageable pieces 

• A way to represent software functionality in a nonredundant fashion 

• A way to create a logical model of what the software system does rather than
how to do it

Fig. 1. The essential elements of structured analysis notation using bank transactions as an
example.



The following definitions are associated with the notation in Fig. 1.

Data Flow Diagram. A data flow diagram is a network of related functions or
processes and the corresponding data interfaces between these components. The
notations shown in Fig. 1 that are used to depict a data flow diagram consist of:

• Labeled arrows, which show the data flows (information flow) between processes

• Circles (or bubbles), which represent the processes or transformations being
modeled by the system

• Two parallel lines, which represent data stores, or places where information can
be stored

• Rectangles, which represent the data sources and destinations (sinks) in the system

Fig. 2. A structure chart showing the essential elements of structured design notation for the
bank example.




Task Structuring Criteria 

Task structuring criteria are rules that guide the designer in
combining SA process transformations (process bubbles or
pSpecs) and control transformations to form concurrent
tasks, while separating those transformations that need to
be separate into independent tasks. These criteria reflect
the same reasoning that an experienced real-time system
designer might use when deciding on a concurrent task
structure. ADARTS organizes these criteria for systematic
application to software design.

The following are ADARTS task structuring criteria for
combining transformations to create concurrent tasks: 



• Sequential cohesion. Combine transformations that execute 
in sequence with other transformations, such as a state ma- 
chine and the processing that occurs on a state transition 
(see Fig. 3).

• Temporal cohesion. Combine transformations that must run
at the same time as other transformations, such as
transformations that must respond to the same event (interrupt)
or the same time tick. Fig. 4 shows the transformations that
take place when a sensor monitoring patient temperature
senses an out-of-limits temperature.

• Functional cohesion. Combine transformations that perform
one or more related functions. These functions typically
operate on the same data structure or same I/O device. For
example, a process that computes trip statistics contains
functions that access a database of collected trip data and
then compute the required statistics (see Fig. 5).

Fig. 3. The role of structured analysis and structured design in the software development process.

Context Diagram. This is the top-level diagram, which shows the environment in
which the software system being designed is supposed to work. The context
diagram consists of one circle and the sources and sinks in the system's operating
environment. This diagram forms the basis from which the rest of the system is
designed. One or more levels of data flow diagrams can be derived from the context
diagram. Multiple levels are created by partitioning the processes in the data
flow diagrams.

Data Dictionary. The data dictionary is used to define all data flows and components
of data flows and data stores. It provides a single place to record information
that is necessary to understand the data in the data flow diagram.

Process Specifications. When a process can no longer be partitioned, the
resulting processes are called primitive processes. Process specifications, or
pSpecs, are used to describe these primitive processes. The notations commonly
used in pSpecs include structured English, decision trees, and decision tables.

A data dictionary and process specifications are symbolically represented in Fig. 3.

Structured Design

Structured design is the process of refining the output from the structured analysis
phase to design the module structure that will lead to a particular implementation
of the software system. The steps in the process include:

• Derive a representation of the program structure with a structure chart. While the
structure chart can be created from any system specification, it is typically created
from a flattened data flow diagram. The structure chart consists of three basic
features: boxes representing modules, arrows connecting the modules, and short
arrows with circular tails representing data passed from one module to another
(see Fig. 2).

• Expand the high-level definition by identifying lower-level modules needed to
carry out the higher-level functions.

• Improve the representation by employing the design principles of cohesion and
coupling. Coupling measures the degree to which modules depend on each other,
and cohesion measures the degree to which elements within a single module are
related to each other.

• Complete the detailed module design by employing a procedural logic description
such as a minispecification (mSpec). An mSpec is similar to a pSpec for processes
in structured analysis, except this time each module is being documented.

The criteria for separating transformations into independent
tasks include:

• Event dependency. Use a separate task for each transformation
or group of transformations dependent on:
   o Device I/O constraints such as responding to asynchronous
     I/O requests
   o User interface constraints such as independent users or
     user interface sequential activities such as windows
   o Periodic events (events that initiate transformations at
     regular time intervals).

• Control Transformation. Use a separate task for each
independent control transformation such as a state machine
controller or a transformation that is enabled and disabled
by a control transformation.

• Task Priority. Use separate tasks for time-critical or 
computation-intensive activities. 

• Multiprocessing. Use separate tasks for transformations 
that must execute on separate physical processors. 

ADARTS specifies the order in which task structuring criteria
should be applied so that the first criterion assigned to a
transformation is usually the predominant one. However,
subsequent criteria may contradict the original classification,
and when that happens the original decision should be
reconsidered using good engineering judgment to decide
which criterion is dominant.

Fig. 3. An example of sequential cohesion in which two transformations
that occur in sequence are combined into one task.

Fig. 4. An example of temporal cohesion in which three processes
activated by the same event are combined into a single task.

Fig. 5. An example of functional cohesion in which two transformations
performing functions related to trip data are combined into a
single task.

Intertask Communication and Synchronization

Once the task structure has been defined, the data and event
interfaces between tasks must be determined. The ADARTS
process provides guidelines for choosing how each data and
event flow will be passed between tasks:

• Tightly coupled communication. This type of communication
involves sending messages or events and then waiting for a
response. A model of tightly coupled communication is
shown in Fig. 6a.

• Loosely coupled communication. This type of communication
is implemented by a message or event queue with no
response. In the producer-consumer model, the producer
would continue to send messages to a queue without waiting
for a response from the consumer, which extracts messages
from the queue at its own pace (see Fig. 6b).

• Loosely coupled communication with multiple producers.
This communication style is implemented by a FIFO buffer
or a priority queue. This is the case in which many producers
might try to communicate with one consumer at the
same time (see Fig. 6c).

Each communication or synchronization flow is represented
in ADARTS architecture diagrams by a distinguishing symbol,
and each type of flow is implemented by a specific mechanism
within the run-time environment of the system being
designed.
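As an illustration of the loosely coupled style with multiple producers, the sketch below models a bounded FIFO message queue. It is an assumption-laden outline rather than the mechanism used at ISY: Message and MessageQueue are hypothetical names, and the locking that a real multitasking run-time environment would supply is deliberately left out.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical message type; producer_id distinguishes multiple producers.
struct Message {
    int producer_id;
    std::string text;
};

// Hypothetical bounded FIFO queue. A real design would use the queue service
// of its run-time environment, which also provides the needed locking.
class MessageQueue {
public:
    explicit MessageQueue(std::size_t capacity) : capacity_(capacity) {}

    // Loosely coupled send: returns at once and never waits for the consumer.
    // Returns false if the queue is full and the message was not stored.
    bool send(const Message& m) {
        if (queue_.size() >= capacity_) return false;
        queue_.push_back(m);
        return true;
    }

    // Consumer side: removes the oldest message (FIFO order) if there is one.
    bool receive(Message& out) {
        if (queue_.empty()) return false;
        out = queue_.front();
        queue_.pop_front();
        return true;
    }

    std::size_t pending() const { return queue_.size(); }

private:
    std::size_t capacity_;
    std::deque<Message> queue_;
};
```

A tightly coupled exchange would differ in that the sender blocks until the receiver has produced a reply, so no queue depth is needed.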

Fig. 6. ADARTS diagrams modeling intertask communication and
synchronization. (a) Tightly coupled communication. (b) Loosely
coupled communication. (c) Loosely coupled communication with
multiple producers.

Package Structuring Criteria

The package structuring criteria are rules for creating packages,
or objects. The application of these criteria produces
well-encapsulated software objects using the concept of
information hiding. The ADARTS package structuring process
does not contain many original ideas, but represents a
compilation of existing ideas and strategies applied to a new
domain (real-time systems). These rules fall into one of the
following categories:

• Hardware hiding modules. These modules are used to encapsulate
parts of the virtual machine such as the operating
system or communications mechanisms or interfaces (e.g.,
device drivers) to particular I/O devices (see Fig. 7).

• Data abstraction packages. Each structured analysis data
store becomes the basis for a data abstraction package,
which hides system behavior requirements or software
design decisions associated with data (see Fig. 8).

• Servers. Servers are passive modules that provide services
for other tasks. File servers and print servers are examples
of these types of modules.
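A data abstraction package can be pictured as a class whose data stores are reachable only through access procedures. The sketch below borrows the data names visible in Fig. 8 (current temperature and temperature limits); the class name TemperatureStore, its operations, and the use of double for readings are assumptions made for illustration.

```cpp
// Hypothetical data abstraction package wrapping the temperature data stores
// suggested by Fig. 8; names and representation are illustrative assumptions.
class TemperatureStore {
public:
    TemperatureStore(double low, double high)
        : low_(low), high_(high), current_(low) {}

    // Access procedures: the only way other components touch the data.
    void   set_current(double t) { current_ = t; }
    double current() const       { return current_; }
    void   set_limits(double low, double high) { low_ = low; high_ = high; }
    bool   out_of_limits() const { return current_ < low_ || current_ > high_; }

private:
    double low_, high_;   // temperature limits
    double current_;      // most recent reading
};
```

Because callers never see low_, high_, or current_ directly, the representation (units, storage location, locking) can change without affecting the tasks that use the package.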

Component Interface Specifications 

Component interface specifications (CIS) are textual descriptions
of each ADARTS task and package containing
information that is needed to inspect the high-level design
and move to the detailed design and implementation of that
component. Each component interface specification contains
the name and type of the component, what the component
does and when it does it, the operations provided by the component
(including individual access procedures or functions
with parameter definitions), and the effects of the component's
operations. Fig. 9 shows an example of a component
interface specification.

Fig. 7. An example of a hardware hiding module showing the
interface to an input sensor.

[Fig. 8 panels: Structured Analysis Model; ADARTS Model. Labels: Signal, Signal, Current Temperature, Temperature Limits]

Fig. 8. A model of a data abstraction package.

© Copr. 1949-1998 Hewlett-Packard Co.

August 1993 Hewlett-Packard Journal 95

NAME: Scanning state object
DESCRIPTION:

The scanning_state_object takes a state stimulus as input and produces the
new scanning state (or an error code) as output (in the form of the
scanning-state events). The object also has the ability to go explicitly to a
particular state.

Errors will be explicitly listed in the state tables. This will make the state
tables a better reflection of system operation.

A script will be written to compile descriptions of the scanning-state tables
into ROM data tables. The scanning-state tables for various system
configurations should all be explicitly present in the source files, then
compiled by the script to produce the C source for the state tables.

DATA STORES:

Scanning-state transition table.

Initial scanning state - for use at power on/reset.

OPERATIONS:

Scnst_initialize ()

This entry point should be called at power-on/reset to set up the state
transition table and initialize scanning state variables.

Scnst_chg_state (stimulus)

"Stimulus" is one of an enumeration of possible stimuli. This entry point
chooses a new scan state based on the stimulus or produces an error
code if the stimulus is not allowed in the current scan state. If the scan
state is changed the new scan state is output as an event.

Scnst_goto_state (state)

"State" is one of the possible scanning states.
This is meant to be used by internal applications. It causes the scanning
state to be changed to the indicated state.

Fig. 9. An example of a component interface specification.
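The three operations listed in Fig. 9 map naturally onto a table-driven state machine. The sketch below is one possible reading, written in C++ rather than the C used at ISY: the states, stimuli, and transitions are invented for illustration because the article does not give the real tables, and the new state is returned to the caller instead of being output as an event.

```cpp
// Hypothetical states, stimuli, and transitions; the real tables are not
// given in the article.
enum ScanState { SCAN_IDLE, SCAN_ACTIVE, SCAN_FROZEN, SCAN_STATE_COUNT };
enum Stimulus  { STIM_START, STIM_FREEZE, STIM_STOP, STIM_COUNT };

const int SCNST_ERROR = -1;   // marks a stimulus not allowed in a state

class ScanningStateObject {
public:
    ScanningStateObject() { initialize(); }

    // Counterpart of Scnst_initialize(): load the table, set the initial state.
    void initialize() {
        for (int s = 0; s < SCAN_STATE_COUNT; ++s)
            for (int k = 0; k < STIM_COUNT; ++k)
                table_[s][k] = SCNST_ERROR;
        table_[SCAN_IDLE][STIM_START]    = SCAN_ACTIVE;
        table_[SCAN_ACTIVE][STIM_FREEZE] = SCAN_FROZEN;
        table_[SCAN_ACTIVE][STIM_STOP]   = SCAN_IDLE;
        table_[SCAN_FROZEN][STIM_START]  = SCAN_ACTIVE;
        table_[SCAN_FROZEN][STIM_STOP]   = SCAN_IDLE;
        state_ = SCAN_IDLE;
    }

    // Counterpart of Scnst_chg_state(): returns the new state, or SCNST_ERROR
    // (leaving the state unchanged) if the stimulus is not allowed.
    int change_state(Stimulus stim) {
        int next = table_[state_][stim];
        if (next == SCNST_ERROR) return SCNST_ERROR;
        state_ = static_cast<ScanState>(next);
        return state_;
    }

    // Counterpart of Scnst_goto_state(): force a particular state.
    void goto_state(ScanState s) { state_ = s; }

    ScanState state() const { return state_; }

private:
    int table_[SCAN_STATE_COUNT][STIM_COUNT];   // scanning-state transition table
    ScanState state_;                           // current scanning state
};
```

Listing the disallowed combinations explicitly as SCNST_ERROR entries follows the specification's remark that errors should appear in the state tables themselves.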

Experience with ADARTS

Architecture Design 

We used ADARTS in the design of a software architecture 
for new ultrasound imaging products. This project was di- 
vided into two parts. The first part dealt with the develop- 
ment of the system software that provides a framework for 
application development and all generic services required in 
the application domain. The second part of the project dealt 
with the development of the software applications that provide
the specific functionality required by a target system.
Each of these two parts of the project used SA and 
ADARTS, but in slightly different ways. 

First, the architectural structure and system components 
were developed. The high-level functions of the architecture 
were specified using structured analysis. This specification 
treated all functionality in terms of the general processing 
flow required for any application, without defining the spe- 
cifics of any particular application. The specification model 
was validated by walking through test cases derived from a 
number of representative applications.

After the SA specification was complete, the ADARTS 
technique was applied to design the task and package structure
and identify the communication mechanisms to be
used. Component interface specifications were created, de- 
tailing the interfaces and functionality of each system com- 
ponent, followed by the detailed design and implementation 
of each component. Fig. 10 illustrates a key part of the 
ADARTS process, in which functionality is grouped into 
coherent tasks and packages using the appropriate structur- 
ing criteria. The sacks drawn around the various groups of 



data How diagram elements in Fig. 10 show the application 
of the task and package structuring and intertask synchro- 
nization criteria described above. After several iterations 
these sacks were transformed into the simplified ADARTS 
architectural diagram shown in Fig. 1 1. The letters in Figs. 
1 0 and 1 1 show the correspondence between the two system 
representations. 

After the architecture was specified, designed, and implemented,
attention was turned to the second part of the
project: development of applications to run within the new
architecture. Once again, structured analysis was used for 
specification of the software. However, we enhanced the 
design step by adding supplemental criteria to guide the 
designers in allocating application functionality to pre- 
viously designed architectural components. For example, 
within certain architecture components places were left 
open to plug in application software that: 

• Processes a keystroke (see vk (virtual key) functions in 
Fig. 11) 

• Defines an application-specific parameter (see the agents 
data structure in Fig. 11) 

• Adjusts an application parameter value when other parame- 
ters it depends on change (see the check routines function in 
Fig. 11). 

The supplemental criteria helped the application designers
determine where each aspect of the application functional- 
ity should reside within the architecture. The ADARTS 
methodology was thus used as a template for creating a 
more specific high-level design method. Detailed design for 
each component of the application then proceeded in the 
usual way. 
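
The idea of leaving places open in architecture components can be sketched in a few lines. The sketch below is illustrative only: the project itself used C on a real-time operating system, and every name here (Architecture, register_vk, define_parameter, register_check, and the loop parameters) is a hypothetical stand-in, not taken from the actual product. It shows the three kinds of plug-in points listed above: a virtual key handler, an application-specific parameter, and a check routine that adjusts one parameter when another one it depends on changes.

```python
from typing import Callable, Dict, List

Hook = Callable[["Architecture"], None]


class Architecture:
    """Hypothetical architecture component with open plug-in points."""

    def __init__(self) -> None:
        self.vk_functions: Dict[str, Hook] = {}        # virtual key -> handler
        self.parameters: Dict[str, int] = {}           # application parameters
        self.check_routines: Dict[str, List[Hook]] = {}  # parameter -> dependents

    def register_vk(self, key: str, handler: Hook) -> None:
        self.vk_functions[key] = handler

    def define_parameter(self, name: str, value: int) -> None:
        self.parameters[name] = value

    def register_check(self, depends_on: str, routine: Hook) -> None:
        self.check_routines.setdefault(depends_on, []).append(routine)

    def set_parameter(self, name: str, value: int) -> None:
        # Store the new value, then run every check routine that depends on it.
        self.parameters[name] = value
        for routine in self.check_routines.get(name, []):
            routine(self)

    def press(self, key: str) -> bool:
        # Dispatch a keystroke; report whether an application handled it.
        handler = self.vk_functions.get(key)
        if handler is None:
            return False
        handler(self)
        return True


def build_demo() -> Architecture:
    """Plug a made-up loop review application into the architecture."""
    arch = Architecture()
    arch.define_parameter("loop_length", 100)
    arch.define_parameter("playback_start", 80)

    def clamp_start(a: Architecture) -> None:
        # Keep the playback start inside the (possibly shortened) loop.
        limit = a.parameters["loop_length"] - 1
        if a.parameters["playback_start"] > limit:
            a.set_parameter("playback_start", limit)

    def shorten_loop(a: Architecture) -> None:
        a.set_parameter("loop_length", a.parameters["loop_length"] // 2)

    arch.register_check("loop_length", clamp_start)
    arch.register_vk("VK_SHORTEN", shorten_loop)
    return arch
```

In this sketch the application designer never edits the Architecture class; deciding which of the three registration calls a piece of functionality belongs to plays the role of the supplemental allocation criteria.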

Package Design 

The degree to which the package structuring procedure of 
ADARTS was used during the project varied significantly. In
some cases, the structuring criteria were applied rigorously. 
For example, in the continuous loop review application, 
which is an ultrasound application that supports acquiring 
video images into memory and playing them back as contin- 
uous loops, the criteria were applied to a leveled data flow 
diagram, leading to a highly modular ADARTS design 
consisting of objects with cohesive operations. 
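
As a rough illustration of an object with cohesive operations, the sketch below models a package that hides its frame storage and exposes only acquire and playback operations. It is an assumption-laden stand-in, not the actual design: the class name, the capacity parameter, and the use of Python instead of C are all hypothetical.

```python
from collections import deque
from itertools import cycle, islice
from typing import List


class LoopReviewPackage:
    """Hypothetical package: callers never touch the frame memory directly."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # Hidden state: once memory is full, the oldest frame is dropped.
        self._frames: deque = deque(maxlen=capacity)

    def acquire(self, frame: str) -> None:
        """Store one video frame in image memory."""
        self._frames.append(frame)

    def play(self, count: int) -> List[str]:
        """Return `count` frames, replaying the stored frames as a continuous loop."""
        if not self._frames:
            return []
        return list(islice(cycle(list(self._frames)), count))
```

Because the storage is private to the package, its representation could change without affecting any caller, which is the information hiding property the structuring criteria aim for.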

In many cases, ADARTS was used simply as a notation to 
show an object representation of a system's functionality. 
Some ADARTS designs were derived from a complete and 
leveled SA and others were derived from a high-level or 
abbreviated SA. Although the package structuring criteria 
were not explicitly used here, designers still applied the 
information hiding concepts recommended by ADARTS. For
example, the Interpret Stimulus component shown in Fig. 11,
which encapsulates the user interface functionality of the 
system, is quite complex. The ADARTS design for this com- 
ponent, although not derived from a complete SA, is very 
useful for showing the interactions between the packages. 
The component interface specifications for this component 
made it easier to understand the individual packages 
contained in the component. 

Similarly, while the criteria developed to guide engineers in
allocating specification functionality to architecture components
were not always used explicitly, they communicated
to designers the choices that had to be considered when
assigning the functionality of the application being designed
to the appropriate components.

Fig. 10. A leveled structured analysis representation of the software architecture for the ultrasound imaging products.

* Some of the places left open in packages for plugging in application software.
** The component interface specification shown in Fig. 9 is for this package.

Fig. 11. The ADARTS architectural diagram obtained, after several design iterations, from the structured analysis diagram shown in Fig. 10.

Tools and Techniques 

We used a commercially available CASE product to generate
the ADARTS diagrams. Since the product was targeted for
Ada users, we had to deal with several drawbacks in using
the tool because we were using the C language and a real-time
operating system. For example, the product provided
no direct support for the message queue symbol (used to
show loosely coupled communication between processes),
so we constructed the required symbol from primitive line
structures. In addition, there was no integrated mechanism
for creating component interface specifications and tying
them to ADARTS components. Also lacking was a traceability
mechanism for tracing system requirements from SA to
ADARTS to SD.
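
The loosely coupled communication that the message queue symbol denotes can be sketched as follows. This is only an illustration: the project used C and a real-time operating system, whereas this sketch uses Python threads and the standard queue module, and the task names, the frame labels, and the STOP sentinel are all hypothetical.

```python
import queue
import threading
from typing import List

STOP = object()  # hypothetical sentinel marking the end of the message stream


def acquire_task(out_q: queue.Queue, frames: List[str]) -> None:
    """Producer task: posts messages and continues without waiting for a reply."""
    for frame in frames:
        out_q.put(frame)
    out_q.put(STOP)


def display_task(in_q: queue.Queue, shown: List[str]) -> None:
    """Consumer task: handles messages in arrival order, at its own pace."""
    while True:
        msg = in_q.get()
        if msg is STOP:
            break
        shown.append(f"displayed {msg}")


def run_demo(frames: List[str]) -> List[str]:
    q: queue.Queue = queue.Queue(maxsize=4)  # bounded queue between the two tasks
    shown: List[str] = []
    consumer = threading.Thread(target=display_task, args=(q, shown))
    producer = threading.Thread(target=acquire_task, args=(q, frames))
    consumer.start()
    producer.start()
    producer.join()
    consumer.join()
    return shown
```

The producer blocks only when the queue is full, so the two tasks are coupled by the queue alone and not by each other's timing.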

Since the architecture software is a key part of the imaging 
product, we paid extra attention to using the development 
process to attain high-quality software. We adopted a gen- 
eral software development framework for the project. The 
steps in our process and the deliverables are summarized in 
Table I.

Table I
Development Process Using ADARTS

Phase                      Deliverable

Requirements Generation    Software Requirements Specification
System Specification       Structured Analysis
Architecture Design        ADARTS
Detailed Design            Structured Design
Implementation             Source Code

Each step in this process was usually followed by an inspec- 
tion by the appropriate individual or group. 

Summary 

The following is a list of the strengths and weaknesses we 
found by using this version of ADARTS in our environment.

Strengths. Some of the contributions and positive aspects of 
the ADARTS technique include: 

• Continuity and task structure. ADARTS provides a linkage
between an SA model and the detailed design of individual
software modules by partitioning the specification into the
optimal set of concurrent tasks and the appropriate communication
mechanisms between them. SA and SD alone do not
aid in the design of the overall concurrent task structure.

• Package structure. ADARTS, through its package structuring
criteria, provides a method for achieving a reasonable
object structure for the functionality represented by the SA
model. SD alone, through its transform analysis and transaction
analysis techniques,* is not effective for building
encapsulated objects. Encapsulation is the predominant
object-oriented design concept applied to our software
development activities, and ADARTS supports this design
aspect very well.

• Visibility. ADARTS design deliverables (architecture diagrams
and component interface specifications) make a software
design more visible, promoting more effective design inspections
and making design concepts clear to other engineers
who have a need to understand or maintain the software.

• Systematic approach. The steps used in ADARTS provide a
systematic approach to system-level design, reducing the
thrashing that can occur when following unstructured or ad
hoc system design methods.

• Intuitive. ADARTS is easy to understand for new and experienced
software engineers and intuitive to those familiar
with real-time software design.

• Acceptance. ADARTS was accepted by the engineers using
it at ISY, although their reasons varied widely. Each of the
strengths stated above was cited by one or more engineers
as the most valuable contribution of ADARTS.

* Transaction analysis is a design strategy based on a study of the transactions the system must
process. Transform analysis is a design strategy based on the study of the data flows in a
system and the transformations performed on that data.

Weaknesses. Just as we found many strengths in the
ADARTS technique, we also found some weaknesses in using
ADARTS in our environment. These weaknesses include:

• Object orientation. ADARTS falls short in its support for
several of the currently accepted object-oriented design 
characteristics. For example, there is no provision for 
defining object classes or inheritance. 

• Tools. Manipulating the architecture diagrams used in 
ADARTS (as well as the data flow diagrams and structure 
charts used in SA and SD) with the currently available CASE
tools is time-consuming and has been a frequent complaint 
from engineers using these methods. 

• Ada. We didn't have a need for the Ada-specific structures 
discussed in the ADARTS paper, and therefore we did not 
gain the full benefit inherent in the ADARTS methodology. 

Conclusion 

We found ADARTS to be an extremely effective technique 
for bridging the gap between a structured analysis specifica- 
tion and the structured design of the software modules that 
make up a software system. By providing a path between 
the two techniques, it makes both far more valuable than 
they would be otherwise. For structured analysis, the con- 
tribution to the definition of concurrent tasks and communi- 
cation mechanisms is indispensable, but even if there is no 
concurrency required, ADARTS helps in identifying an ob- 
ject structure before applying the next detailed design step. 
Even if ADARTS is used on an SA specification that requires 
neither concurrency nor objects, it produces the trivial-case 
high-level design consisting of a single task in a single pack- 
age which can then be constructed using SD. Thus, there is 
no harm in applying the technique to all designs. 